Identifying presence and composition of cell-free nucleic acids

ABSTRACT

This disclosure describes example techniques and systems for identifying the presence and/or composition of nucleic acids in the blood of a host organism of a model species harboring tissue of a donor organism of another species. For example, the technique may involve identifying the presence and composition of nucleic acids in the blood of a mouse harboring tissue of a human or a companion animal. These cell-free nucleic acids that are identified can be used as biomarkers to determine the presence of a disease, its biological behavior, its rate of progression, and/or the response of the disease to one or more unique therapies. In other examples, the cell-free nucleic acids may be used as biomarkers to determine a response of the host species to the tissue of the donor organism or a response of tissue derived from the second organism to transplantation within the first organism of the first species.

This application is a continuation of U.S. patent application Ser. No. 15/783,776, filed Oct. 13, 2017, which claims the benefit of U.S. Provisional Patent Application No. 62/407,987, filed Oct. 13, 2016, the entire content of which is incorporated herein by reference.

TECHNICAL FIELD

This disclosure relates generally to methods for identifying and analyzing nucleic acids and, more particularly, methods for identifying and analyzing cell-free nucleic-acid biomarkers.

BACKGROUND

The composition and abundance of an organism's nucleic acids provide biomarkers indicative of various aspects of the organism's genome and transcriptional expression, including the organism's predisposition toward particular biological states, as well as the presence and progression of such biological states. Much of a living, multicellular organism's total nucleic acid complement is located intracellularly: DNA is chiefly located within the nuclei of the cells, whereas RNA of numerous types is abundant within the various organelles and cytoplasm of cells. Thus, nucleic acids may be derived from cells and used as biomarkers to determine a biological state of the organism, such as the presence of a disease or the biological behavior of the disease. In applications such as xenografting, tissue from a donor animal may be grafted into a host animal, and then the biological behavior of the donor tissue may be evaluated in the host animal by analyzing nucleic-acid biomarkers derived from cells of the donor tissue.

SUMMARY

This disclosure describes example techniques and systems for determining the composition and abundance of nucleic acids in the blood of a host animal that has received a tissue xenograft and detecting biomarkers of a biological state via identification of the nucleic acids. Such techniques may include creating a combined reference genome that incorporates gene sequences from both the genome of the host animal and the genome of the xenograft donor animal. Cell-free nucleic acid sequences isolated from a blood sample of the host animal may then be aligned with the combined reference genome. In order to distinguish between sequences originating from the donor animal and those originating from the host animal, those sequences that align with a single region of the combined reference genome may be retained for further analysis. The retained sequences may then be analyzed to determine their identity, species of origin, abundance, and association with predetermined gene clusters that represent known biochemical pathways. In some examples, the techniques described herein may enable accurate identification and analysis of biomarkers associated with a biological state of xenograft donor tissue or xenograft host tissue based on a blood sample of the host animal, in part by eliminating from consideration confounding sequences originating from one of the host animal or the donor animal.

In one example, a method comprises obtaining a plurality of exosomes from a sample of bodily fluid derived from a first organism of a first species, wherein the first organism of the first species comprises tissue derived from both the first organism of the first species and tissue derived from a second organism of a second species, and wherein the plurality of exosomes comprises a plurality of molecules of ribonucleic acid (RNA); determining, for substantially each molecule of the plurality of molecules of RNA, a corresponding RNA sequence; determining, for each corresponding RNA sequence, whether the RNA sequence is substantially aligned with exactly one corresponding gene sequence of a combined reference genome; determining one or more characteristics of each RNA sequence substantially aligned with exactly one corresponding gene sequence of the combined reference genome, wherein the one or more characteristics include at least one of: a gene name of the corresponding gene sequence; a species associated with the corresponding gene sequence, wherein the species is one of the first species and the second species; determining an approximate number of times that each RNA sequence substantially aligned with exactly one corresponding gene occurs in the sample of blood; and determining, based on one or more of the one or more characteristics of each RNA sequence substantially aligned with exactly one corresponding gene sequence of the combined reference genome or the approximate number of times each RNA sequence substantially aligned with exactly one corresponding gene occurs in the sample of bodily fluid, whether the tissue derived from the second organism of the second species contains a biomarker indicative of at least one of a disease status, a response of the first organism of the first species to the tissue derived from the second organism of the second species, or a response of tissue derived from the second organism to transplantation within the organism of the first species.

In another example, a method comprises determining a corresponding RNA sequence for substantially each molecule of a plurality of molecules of ribonucleic acid (RNA), wherein a plurality of exosomes from a sample of bodily fluid derived from a first organism of a first species comprises the plurality of molecules of RNA, and wherein the first organism of the first species comprises tissue derived from both the first organism of the first species and tissue derived from a second organism of a second species; determining, for each corresponding RNA sequence, whether the RNA sequence is substantially aligned with exactly one corresponding gene sequence of a combined reference genome; determining one or more characteristics of each RNA sequence substantially aligned with exactly one corresponding gene sequence of the combined reference genome; determining an approximate number of times that each RNA sequence substantially aligned with exactly one corresponding gene occurs in the sample of bodily fluid; and determining, based on one or more of the one or more characteristics of each RNA sequence substantially aligned with exactly one corresponding gene sequence of the combined reference genome or the approximate number of times each RNA sequence substantially aligned with exactly one corresponding gene occurs in the sample of bodily fluid, whether the tissue derived from the second organism of the second species contains a biomarker indicative of at least one of a disease status, a response of the first organism of the first species to the tissue derived from the second organism of the second species, or a response of tissue derived from the second organism to transplantation within the first organism of the first species.

In another example, a system comprises a reservoir configured to receive a sample of bodily fluid; and processing circuitry configured to: determine a corresponding RNA sequence for substantially each molecule of a plurality of molecules of ribonucleic acid (RNA), wherein a plurality of exosomes from the sample of bodily fluid is derived from a first organism of a first species and comprises the plurality of molecules of RNA, and wherein the first organism of the first species comprises tissue derived from both the first organism of the first species and tissue derived from a second organism of a second species; determine, for each corresponding RNA sequence, whether the RNA sequence is substantially aligned with exactly one corresponding gene sequence of a combined reference genome; determine one or more characteristics of each RNA sequence substantially aligned with exactly one corresponding gene sequence of the combined reference genome; determine an approximate number of times that each RNA sequence substantially aligned with exactly one corresponding gene occurs in the sample of bodily fluid; and determine, based on one or more of the one or more characteristics of each RNA sequence substantially aligned with exactly one corresponding gene sequence of the combined reference genome or the approximate number of times each RNA sequence substantially aligned with exactly one corresponding gene occurs in the sample of bodily fluid, whether the tissue derived from the second organism of the second species contains a biomarker indicative of at least one of a disease status, a response of the first organism of the first species to the tissue derived from the second organism of the second species, or a response of tissue derived from the second organism to transplantation within the first organism of the first species.

The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a flow diagram illustrating an example technique in accordance with the examples of this disclosure.

FIG. 2 is a flow diagram illustrating an example technique in accordance with the examples of this disclosure.

FIG. 3 is a flow diagram illustrating an example technique in accordance with the examples of this disclosure.

FIG. 4 is a functional block diagram illustrating an example system that may be used to implement the techniques described herein, which may include remote computing devices, such as a server and one or more other computing devices, that are connected to one or more external devices via a network.

FIG. 5 is a functional block diagram further illustrating the external server in the example system of FIG. 4 that may be used to implement the techniques described herein.

FIGS. 6A and 6B are graphical representations of differences in gene expression and patient survival times between OS-1 and OS-2 phenotypes.

FIGS. 7A and 7B are photographic representations of data pertaining to the application of the techniques described herein to the OS-1/OS-2 xenograft example.

FIGS. 8A-8C are photographic and graphical representations of data pertaining to the application of the techniques described herein to the OS-1/OS-2 xenograft example.

FIGS. 9A-9J are graphical representations of data pertaining to the application of the techniques described herein to the OS-1/OS-2 xenograft example.

FIGS. 10A and 10B are graphical representations of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example.

FIG. 11 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example.

FIG. 12 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example.

FIG. 13 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example.

FIG. 14 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example.

FIG. 15 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example, indicating predicted upstream regulators identified from the data analysis technique of FIG. 14.

FIG. 16 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example, indicating biological processes and canonical pathways associated with gene clusters identified from the data analysis technique of FIG. 14.

FIG. 17 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example.

FIG. 18 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example.

FIG. 19 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example.

FIGS. 20A and 20B are graphical representations of a data gathering and analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example.

FIGS. 21A-21C are graphical representations of data gathering and analysis techniques in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example.

FIGS. 22-27 illustrate tables providing additional information pertaining to the application of the techniques described herein to the OS-1/OS-2 xenograft example.

FIGS. 28A-28C are graphical representations of a workflow by which RNA contents of OS-derived exosomes from cultured cells may be defined using next-generation sequencing, and outcomes of example data analyses performed on data derived from the workflow indicating that exosomes from OS-1 and OS-2 contain transcripts involved in different cell behaviors.

FIGS. 29A-29C are graphical representations of a workflow by which RNA contents of OS-derived exosomes from cultured cells may be defined, and outcomes of example data analyses performed on data derived from the workflow indicating that decreased expression of cytokines may be found in fibroblasts treated with OS-2 derived exosomes.

FIGS. 30A and 30B are graphical representations of differentially expressed mouse genes.

FIGS. 31A and 31B are graphical representations of differentially expressed mouse genes and canine orthologs of the mouse genes.

FIG. 32 is a graphical illustration of a bioinformatics method that shows the number of transcripts at each step of differential expression analysis.

FIGS. 33A-33C are graphical representations of 198 differentially expressed transcripts.

FIGS. 34A-34D are graphical representations of the detection of biomarkers of disease and host response.

DETAILED DESCRIPTION

Favorable clinical outcomes of many medical conditions depend, to varying degrees, on factors such as the accurate prediction of a patient's risk of a condition or disease, reliable testing for early detection of the condition or disease, accurate prediction of disease progression, or the selection and administration of appropriate therapies. In some cases, the composition and abundance of nucleic acids present in the patient's tissue may be used as one or more biomarkers corresponding to the patient's risk of the condition or disease, or to the presence, progression, or potential response of the condition or disease to a particular therapy. Identification of such biomarkers present in the patient's tissue thus may facilitate one or more interventions associated with a favorable clinical outcome of the patient.

At present, diagnostics and interventions based on nucleic-acid biomarkers largely rely on the identification of single nucleic-acid biomarkers, which may be identified based on cells derived from tissue samples (e.g., tumor biopsies) obtained from multiple patients having a particular disease. Tissue samples from another patient then may be analyzed to determine whether the biomarker is present in the sample. While some such biomarkers pass predetermined statistical thresholds for biomarker identification, the use of single biomarkers in diagnosis and treatment selection is subject to several drawbacks. For example, in the case of a particular type of cancer, there may be inherent genetic heterogeneity within tumors in individual patients and among different patients with the same type of cancer. Due to such heterogeneity, analysis of a tissue sample from a patient may lead to a false-negative diagnosis that the patient does not have the particular type of cancer because the single biomarker was not detected in the tissue sample.

In addition, methods for identifying and detecting biomarkers based on nucleic acids from cells derived from tissue samples are subject to their own limitations. For example, such methods limit the scope of inquiry to the tissue samples themselves. Although nucleic-acid biomarkers identified in tissue samples (e.g., tumor biopsies) may indicate the presence of a disease or condition, such biomarkers do not reflect aspects of the disease or condition that occur outside of the sampled tissue. For example, while much of a cell's nucleic acids are located within the cell, some nucleic acids, such as RNA, can be transported out of the cell inside vesicles called exosomes. In particular, cell-free RNA may be found in the bloodstream of animals inside exosomes.

The RNA contained within exosomes may be indicative of its parent cell's transcription profile, and may provide biomarkers indicative of a patient's biological state. For example, the RNA contained within exosomes found within a patient's bloodstream may be indicative of one or more of the patient's risk of developing a condition or disease, the presence of the condition or disease, the progress of the condition or disease, or the potential future progress of the condition or disease. Indeed, some species of RNA contained within exosomes may be indicative of a metastatic cancer phenotype (i.e., that a cancerous condition is likely to metastasize), and may indicate that a process of metastasis of a primary tumor has begun. Such species of RNA may be microRNAs, which may be packaged into exosomes, secreted from a tumor cell into the bloodstream of an organism, and disseminated to distant tissue sites. Once such microRNAs have infiltrated a distant tissue site, they may perform a signaling or conditioning function that causes non-cancerous cells at the distant tissue site to undergo changes that make the tissue site more favorable to colonization by tumor cells.

In some examples, monitoring of the composition and abundance of such RNA species present within the bloodstream of a patient may reveal a progressive increase in the abundance of such RNA species, which in turn may serve as a biomarker of progress of a cancerous condition toward metastasis. Such findings may inform a clinician's decision to select a particular type of therapy over another type of therapy, as different types of therapy may be predictably more appropriate at different stages of the condition or disease.

Thus, a complement of multiple cell-free nucleic acid biomarkers derived from blood may provide a better indicator of the presence, progress and future progression of a condition or disease than nucleic acid biomarkers derived from cells taken from a tissue sample. The collection of blood samples for the detection and analysis of cell-free nucleic acid biomarkers may be less invasive, costly, and time consuming than tissue biopsies and/or imaging procedures that may be involved in the detection of nucleic acid biomarkers from tissue samples. Thus, diagnostic and treatment-selection methods based on the detection of cell-free nucleic acid biomarkers derived from exosomes present in a patient's bloodstream may provide numerous clinical benefits over interventions based on the detection of a single nucleic acid biomarker derived from cells.

Described herein are example techniques and systems for cell-free nucleic acid biomarker identification that may be used to account for the genetic heterogeneity of many conditions and diseases, as well as example methods for detecting such biomarkers within the blood of a patient. Such methods may be used for virtually any disease or condition for which there are cell lines that grow as xenografts, or patient-derived xenografts that reflect the expected variance in a disease or condition. One method involves the use of gene cluster expression summary scores. These scores account for coordinated transcriptional regulation of multiple genes, overcoming deficiencies of single biomarkers. Another method includes the use of reconstructed hybrid genomes (e.g., mouse host and tumor donor species) and bioinformatics approaches to identify mRNAs expressed exclusively in the tumor cells (donor) and mRNAs expressed exclusively in supporting stromal cells (host). Such methods may distinguish mRNAs present in donor-derived exosomes from mRNAs present in host-derived exosomes, enabling a novel means to identify candidate biomarkers. Because the approach is not restricted to identification of donor-derived exosomes, it can also measure biomarkers of host response as well as biomarkers that can define response to therapy. The example methods described herein thus may provide efficient and cost-effective ways to discover biomarkers that can inform the risk of diseases or conditions and their diagnosis and prognosis, that may be used to predict response to therapy, and that may be rapidly validated in patient samples.
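As an illustration only, a gene cluster expression summary score might be sketched as follows. The disclosure does not define the score at this point, so the formula used here (the mean, over the genes of a cluster, of each gene's z-score across samples) is an assumption, and the function name, gene names, and expression values are hypothetical.

```python
from statistics import mean, pstdev

def cluster_summary_scores(expression, cluster):
    """Return one summary score per sample for a gene cluster.

    expression: {gene: [expression value per sample]}
    cluster:    list of gene names belonging to the cluster
    ASSUMPTION: the score is the mean per-gene z-score; the source text
    does not specify the formula.
    """
    n_samples = len(expression[cluster[0]])
    z = {}
    for gene in cluster:
        values = expression[gene]
        mu, sd = mean(values), pstdev(values)
        # A gene with no variation across samples contributes zero.
        z[gene] = [(v - mu) / sd if sd else 0.0 for v in values]
    return [mean(z[g][i] for g in cluster) for i in range(n_samples)]

# Hypothetical toy data: two samples, two co-regulated genes.
expression = {"geneA": [1.0, 3.0], "geneB": [2.0, 6.0]}
scores = cluster_summary_scores(expression, ["geneA", "geneB"])
```

Because both toy genes rise together from the first sample to the second, the two samples receive scores of opposite sign, which is the kind of coordinated signal a single-gene biomarker could miss.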

In some methods described herein, identification of cell-free nucleic acid biomarkers may be conducted via xenograft procedures. In such procedures, a tissue sample may be obtained from a donor animal of a first species. Whole tissue or cell lines cultured from the tissue sample may then be grafted into a host animal of another species. In some examples, the tissue sample or cell lines derived from the donor animal may be derived from tissue harboring a disease or condition, such as tissue from a cancerous tumor. In such examples, healthy tissue or cell lines derived from another donor animal may be introduced into another host animal as a control. In other examples, the tissue derived from the donor animal may be healthy organ tissue, and it may be desirable to identify cell-free nucleic acid biomarkers associated with a response of the host tissue and/or the donor organ tissue to a graft procedure. In either example, blood samples subsequently may be obtained from the host animal, from which exosomes containing RNA may be extracted. The RNA then may be sequenced, quantified, and aligned with a hybrid genome prepared by combining the genome of the donor-animal species and the host-animal species. In order to distinguish RNA sequences derived from the donor species from RNA sequences derived from the host species, sequences derived from the respective species may be analyzed separately. In examples in which cell-free nucleic acid biomarkers of a disease or condition are to be determined, donor sequences resulting from control donor animals also may be compared to donor sequences resulting from test donor animals. RNA sequences that are differentially expressed in the tissue derived from the control donor animals and the tissue derived from the test donor animals may be identified as biomarker sequences for use in disease diagnosis or analysis of disease behavior. Similarly, RNA sequences that are differentially expressed in instances of organ-tissue acceptance or rejection by the host animals may be identified as biomarker sequences for use in analysis of organ-transplant feasibility.

In other examples, a biological status associated with a particular biomarker may be associated with acceptance or rejection of the tissue of the donor animal by the body of a host animal. Example techniques described herein also may include identifying a predisposition toward a particular biological state of the donor animal based on the presence of a biomarker, such as a predisposition toward acceptance or rejection of donor tissue. In other examples, such techniques may include identifying targeted therapies based on the presence of a biomarker that indicates that the body of the host animal is accepting or rejecting the tissue of the donor animal, such as in the case of organ tissue transplanted into the host animal to assess the long-term feasibility of organ transplant from the donor animal into a different host animal of yet another species.

For the sake of illustration, the example techniques described herein are described within the context of an example in which the host animal is a mouse harboring cells associated with a disease of humans or companion animals, such as cells derived from a canine osteosarcoma tumor. The xenografts described in the example presented herein represent two molecular phenotypes of osteosarcoma (OS-1 and OS-2) with distinct biological behavior that are highly conserved between dogs and humans. However, it should be understood that the example techniques may be used in the identification and detection of biomarkers associated with other diseases or conditions of humans or other animal species.

Osteosarcoma (OS) is a heterogeneous disease with a disproportionate human impact, as it mainly affects children and adolescents, and is the most common malignant pediatric tumor of bone. Standard therapy for OS comprises neoadjuvant chemotherapy, surgery and adjuvant chemotherapy. The 5-year survival rate of OS patients with localized and operable OS is 60-70%, but the outcome of patients with non-resectable or metastatic OS is poor, as more than half of patients with OS succumb to metastatic disease.

OS is also the most common primary malignant tumor of bone in dogs, and it is particularly prevalent in large and giant breeds. OS is an incurable, highly prevalent cancer of large and giant breed dogs that has been identified as a high priority for health research by over 25 AKC Parent breed clubs. In contrast to humans, OS occurs most commonly in older dogs. The number of diagnoses per year has been estimated at >8,000, and possibly as high as 80,000 in the US, with the lifetime risk for OS in some breeds being as high as 1 in 5 to 1 in 7. Similarly to humans, the outcome of canine patients with metastatic OS is poor. Tumors at the primary site may be managed surgically, but most dogs with OS die from metastasis to lungs or to other bones or organs.

These collective statistics illustrate that progress in managing OS has been hindered by its heterogeneity in both humans and in dogs. For example, neither the histological appearance nor the propensity of the tumor cells to elaborate bone, cartilage, or collagen matrices is predictive of behavior, and while recurrent molecular events have been described, these are yet to be adopted as prognostic or predictive biomarkers for this disease. Thus, clarification of the etiology of the disease, development of better strategies to manage disease progression, and methods to guide personalized treatments are among the unmet health needs for both human OS patients and canine patients. These needs may be met by models (e.g., models in species other than humans or canines) that accurately recapitulate the natural heterogeneity of OS in both humans and in dogs. Such models may provide a better understanding of the events that underlie OS tumor heterogeneity and contribute to disease progression, which may enable the development of effective strategies to manage OS and to improve outcomes. In some cases, a single model may be applicable to both humans and dogs, because spontaneous OS may be a homologous cellular and molecular disease of humans and dogs. For example, prognostically significant gene- and microRNA-expression signatures have been discovered that are evolutionarily conserved in human and canine OS. Such expression signatures may predict both the biological behavior of OS and patient survival. While not necessarily linked to metastatic potential, the molecular components of such gene and microRNA expression profiles may reflect tumor growth, invasive potential, time to metastasis, or patient response to therapy.

Techniques for modeling OS to obtain a better understanding of OS disease events are within the scope of this disclosure and are described in further detail below. Such techniques also may enable a better understanding of events that occur in other diseases, such as other diseases that may affect dogs, other non-human animals, or humans. In addition, such techniques may enable a better understanding of events that occur in other medical situations, such as in tissue-transplant or other situations.

In the example described below with respect to FIGS. 7A-34D, the techniques may be illustrated with respect to orthotopic xenografts of canine osteosarcoma in nude mice, or with respect to cells cultured with exosomes derived from OS-1 or OS-2 tumors. In this case, potential biomarkers for disease include nucleic acids (genes) indicative of osteosarcoma (canine origin), nucleic acids indicative of biological behavior and/or progression for specific osteosarcomas (canine origin), and nucleic acids indicative of host response to bone invasion, host response to osteosarcoma in general, and response to distinct osteosarcomas with different biological behavior in particular (all of murine origin).

In some examples, cells used for xenografts are called OS-1 (OSCA-32) and OS-2 (OSCA-40). Such cells may be derived from canine tumors with distinct biological behavior and recapitulate this behavior in xenografts. In this example, the cross-species hybrid genome approach may be used to identify separate canine and mouse sequences from tumor xenografts that inform the progression of disease (in the mouse). Thus, it is possible to use tumor samples grown in mice to determine the contribution of dog sequences (derived from the implanted, growing tumor cells) and mouse sequences (derived from infiltrating stroma) to define features of progression for tumors arising from implantation of the different cell lines. In the following description, references are made to illustrative examples. It is understood that other examples may be utilized without departing from the scope of the disclosure.

FIG. 1 is a flow diagram illustrating an example technique according to this disclosure. At block 102, serum may be isolated from blood collected from mice at a “time 0,” i.e., prior to any manipulation. In some examples, experimental groups may include: mice injected intra-tibially with PBS (phosphate-buffered saline), with no cells, i.e., control for host response to intratibial injection and possible consequent inflammation; mice injected intra-tibially with OS-1 cells; and mice injected intra-tibially with OS-2 cells. In this example, serum may be isolated from blood collected from mice in each group every two weeks for up to 8 weeks. For each group, there may be two cages of 4 mice each. Each cage may be an experimental replicate (blood pooled from all the mice in the cage to isolate sufficient serum for exosomes; furthermore, blood may be pooled for analysis from weeks 2, 4, 6, and 8 for each cage, although aliquots may be preserved from the pool for each week for validation by qRT-PCR).

Exosomes may first be isolated from the serum (102). In some examples, this may be accomplished by using ExoQuick kits from System Biosciences, Inc. (SBI), although other suitable techniques may be used.

Next, total RNA may be isolated from the exosomes (104). For example, the RNA may be isolated by using the Complete SeraMir Exosome RNA Amplification kit from SBI and precipitated with the Dr. GenTLE (Gene Trapping by Liquid Extraction) System from SBI, although other suitable techniques may be used.

Sequencing libraries may be generated (108) from the RNA by using Nextera XT DNA Library Preparation Kit (Clontech) at the University of Minnesota Genomics Center (UMGC), although other suitable techniques and facilities may be used. In some examples, sequencing-library preparation may include RNA purification, reverse-transcriptase PCR production of cDNA from the RNA molecules, PCR amplification of the resulting cDNA molecules, and transcription of the cDNA molecules into RNA. Sequencing may be done at UMGC on a 50 base-pair paired-end (PE) run on a HiSeq 2500 nucleic acid sequencing instrument using Rapid chemistry. In some examples, it may be desirable to use 8 samples per lane and generate >120 M reads, which may be fairly well balanced across projects. Preferably, average quality scores may be above Q30 for all PE reads.
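To make the Q30 threshold concrete, the following minimal sketch converts a Phred quality score to its implied base-call error probability and computes the average score of a read. It assumes the quality string uses the Phred+33 ASCII encoding; the function names and the toy quality string are hypothetical and are not part of the described workflow.

```python
def phred_to_error_prob(q):
    """Error probability implied by Phred score q: p = 10 ** (-q / 10)."""
    return 10 ** (-q / 10)

def mean_quality(quality_string, offset=33):
    """Average Phred score of one read.

    ASSUMPTION: qualities are ASCII-encoded with an offset of 33 (Phred+33).
    """
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores)

# Hypothetical read quality string: '?' encodes Q30 and 'I' encodes Q40.
read_quality = "??II"
avg_q = mean_quality(read_quality)
passes_q30 = avg_q > 30
```

Under this definition, Q30 corresponds to roughly one expected base-call error per 1,000 bases.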

The sequences obtained at block 106 are then compared to a cross-species hybrid genome, followed by bioinformatic analyses (110). A summary of example bioinformatics methods for creating and mapping to the cross-species hybrid genome, and of the workflow of data analysis steps with illustrations, is described below.

FIG. 2 is a flowchart illustrating an example bioinformatics method according to this disclosure. It will be summarized here with respect to FIG. 2 and described in greater detail below. First, a single hybrid reference genome for two species may be created by combining the reference sequences of all chromosomes of each species into one file, with chromosome names modified to indicate the species of origin (202). Next, a single hybrid genome annotation file describing the locations of genes in the genome may be created by combining the annotation of each species into one file, with chromosome and gene names modified to indicate the species of origin (204). A sequence alignment program such as HISAT2 may be used to align RNA-Seq sequence reads to the hybrid genome (206). Most reads will map uniquely to a chromosome of one of the species. Some parts of the genomes will be identical in both species, resulting in a small number of multi-mapped reads mapping to two chromosomes, one from each species, although longer sequence reads reduce the number of multi-mapped reads. The presence and abundance levels of genes are determined by comparing the genomic location of each uniquely aligned read with the genomic locations of genes in the hybrid annotation file and summing the number of reads aligning to each gene.
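
As a minimal sketch of the reference-combining step (202), and not the implementation used in the example, the following Python code prefixes each FASTA sequence name with a species tag before concatenating the records into one list of lines. The function names, the underscore-separated tag format, and the toy sequences are assumptions made for illustration; the disclosure states only that chromosome names are modified to indicate the species of origin.

```python
def prefix_fasta(lines, species):
    """Yield FASTA lines with each sequence name prefixed by a species tag."""
    for line in lines:
        if line.startswith(">"):
            # Header line: ">chr1 ..." becomes ">dog_chr1 ..." (tag format is assumed).
            yield ">" + species + "_" + line[1:]
        else:
            # Sequence line: passed through unchanged.
            yield line


def build_hybrid_fasta(sources):
    """Concatenate several FASTA streams into one list of hybrid-genome lines.

    `sources` maps a species tag (e.g. "dog") to an iterable of FASTA lines.
    """
    hybrid = []
    for species, lines in sources.items():
        hybrid.extend(prefix_fasta(lines, species))
    return hybrid


# Toy, made-up records; real references would be read from files.
dog = [">chr1 toy", "ACGT"]
mouse = [">chr1 toy", "TTGA"]
hybrid = build_hybrid_fasta({"dog": dog, "mouse": mouse})
```

The same renaming may be applied to the chromosome and gene names of each annotation file (204) so that the annotation and the reference stay consistent.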

Next, multi-mapped reads are excluded from the analysis (208). Excluding multi-mapped reads from the abundance estimation step may be useful to help avoid incorrectly identifying the presence of graft-derived nucleic acids. Aligning RNA-Seq reads only to the reference genome of the graft species may result in the spurious identification of graft-derived genes in cases where the genes have identical sequences in both species. Comparing gene expression levels from a xenograft sample with those from a negative control sample may provide further power to reduce false positives. The identity and abundance of genes originating from the donor animal, which in this example may be a dog, are then determined (210). As described in further detail below, the determined identity and abundance of genes originating from the donor animal may be used to determine the presence of disease and disease progression, and may inform treatment decisions.
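
One way the exclusion at block 208 might be applied is sketched below. The sketch assumes SAM-format alignments in which an NH:i tag reports the number of alignments for a read, and reference names that carry a species prefix such as dog_ or mouse_; both are illustrative assumptions, and the SAM records shown are made-up toy data.

```python
from collections import Counter


def unique_reads_by_species(sam_lines):
    """Count uniquely aligned reads per species prefix of the reference name.

    Reads that are unmapped, that report more than one alignment (NH > 1),
    or that carry no NH tag at all are skipped (a conservative choice).
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        if flag & 4:
            continue  # flag bit 0x4: read is unmapped
        nh = None
        for tag in fields[11:]:  # optional tags follow the 11 mandatory fields
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
        if nh != 1:
            continue  # multi-mapped, or alignment count unknown
        species = rname.split("_", 1)[0]  # assumed "species_chromosome" naming
        counts[species] += 1
    return counts


sam = [
    "@HD\tVN:1.0",
    "r1\t0\tdog_chr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
    "r2\t0\tmouse_chr2\t55\t60\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
    "r3\t0\tdog_chr5\t7\t1\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r3\t256\tmouse_chr5\t9\t1\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
counts = unique_reads_by_species(sam)
```

In this toy input, read r3 aligns equally well to one chromosome of each species and is therefore dropped, so it cannot be mistaken for a graft-derived sequence.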

FIG. 3 is a graphical representation related to the techniques described with respect to the flowcharts of FIGS. 1 and 2. Specifically, FIG. 3 provides an overview of the techniques described with respect to the flowcharts of FIGS. 1 and 2 within the context of a dog-to-mouse xenograft, where the dog harbors a primary tumor. However, FIG. 3 is illustrative in nature, and provides a broad overview of the techniques described herein. Other species, such as other murine species, rodent species, feline, porcine, or non-human primate species, may be substituted for the dog and mouse illustrated in FIG. 3. In addition, in some examples, the donor organism may be a human. In the example of FIG. 3, a first organism of a first species (e.g., a dog donor-organism) may harbor diseased tissue such as a primary tumor. A clinician or experimenter may obtain cells from the primary tumor of the dog donor-organism and introduce the cells into a second organism of a second species (e.g., a mouse host-organism) in a xenograft process. Next, the cells from the primary tumor of the dog donor-organism may be allowed to grow and form a xenograft tumor within the mouse host-organism. Thereafter, the clinician or experimenter may obtain a sample of bodily fluid (e.g., blood or blood serum) from the mouse host-organism. Exosomes containing RNA molecules then may be isolated from the sample of bodily fluid, such as by the clinician or experimenter, or by a suitable instrument, and the RNA molecules contained within the exosomes may be sequenced (e.g., by next-generation or other sequencing techniques) to determine a sequence for substantially each molecule of RNA. Bioinformatics analysis then may be performed on the RNA sequences. In some examples, the bioinformatics analysis may be performed by a system that includes processing circuitry (as described below with respect to FIGS. 4 and 5), as well as a suitable receptacle or reservoir configured to receive the sample of bodily fluid.

The bioinformatics analysis illustrated in FIG. 3 may include determining whether each RNA sequence is substantially aligned with exactly one corresponding gene sequence (e.g., a coding or regulatory sequence) of a combined reference genome that includes the genomes of both the dog donor-organism and the mouse host-organism, and determining an approximate number of times that each RNA sequence that aligns with the combined reference genome occurs in the sample of bodily fluid. One or more of the characteristics (e.g., a gene name or a species) of the RNA sequences that align with the combined reference genome or the number of times that such RNA sequences occur in the sample of bodily fluid then may be used to determine whether the sample of bodily fluid contains a biomarker (e.g., a nucleic acid sequence) associated with a disease status of the dog donor-organism.

In some examples, a disease status may be a predisposition to a disease, the presence of a disease, or the progress or potential progression of a disease of the dog donor-organism. In some examples, the disease status may enable a clinician or experimenter to select an appropriate treatment for the dog donor-organism or the mouse host-organism. In other examples, instead of a disease status, the bioinformatics analysis may indicate a response of the mouse host-organism to non-diseased cells or tissue derived from the dog donor-organism, or a response of the non-diseased cells or tissue derived from the dog donor-organism to transplantation within the mouse host-organism. In such examples, the characteristics of the RNA sequences that align with the combined reference genome or the number of times that such RNA sequences occur in the sample of bodily fluid then may be used to determine whether the dog donor-organism is a good candidate to receive a tissue transplant (e.g., an organ transplant).

In some examples, the bioinformatics analysis illustrated in FIG. 3 may include a determination that one or more of the RNA sequences that correspond to exactly one gene sequence are associated with a predetermined cluster of genes. Such clusters of genes may be groups of genes that share one or more functional characteristics. The functional characteristics of the gene clusters may include one or more biological processes or canonical pathways, such as one or more of transcriptional regulation, intracellular signaling, intercellular signaling, cell apoptosis, biomolecule metabolism, biomolecule synthesis, RNA processing, or macromolecule assembly. The example technique broadly illustrated in FIG. 3 will be discussed in greater detail below with respect to FIGS. 7A-34D.

As illustrated in FIGS. 4 and 5, various aspects of the techniques described herein may be implemented within one or more processors, including one or more microprocessors, DSPs, ASICs, FPGAs, or any other equivalent integrated or discrete logic circuitry, as well as any combinations of such components, embodied in programmers, such as physician or patient programmers, electrical stimulators, or other devices. The term “processor” or “processing circuitry” may generally refer to any of the foregoing logic circuitry, alone or in combination with other logic circuitry, or any other equivalent circuitry.

FIGS. 4 and 5 are functional block diagrams of an example system 218 configured to perform the techniques described in accordance with the disclosure. In the example illustrated in FIG. 4, one or more computing devices 230A-230N are connected to network 222. In some examples, an external server device, such as server device 224, may also be connected to network 222. The server device 224 shown in FIGS. 4 and 5 may include processing circuitry 228, memory 226, user interface 242, communication module 244, and power source 240. Processing circuitry 228 may include one or more processors. In one example, processing circuitry 228 is configured to run software instructions in order to control operation of system 218. Processing circuitry 228 can include one or more processors, including one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or any other equivalent integrated or discrete logic circuitry, as well as any suitable combination of such components. The term “processor” or “processing circuitry” may generally refer to any of the foregoing logic circuitry, alone or in combination with other logic circuitry, or any other equivalent circuitry.

Memory 226 may include any volatile or non-volatile media, such as a random access memory (RAM), read only memory (ROM), non-volatile RAM (NVRAM), electrically erasable programmable ROM (EEPROM), flash memory, and the like. As mentioned above, memory 226 may store information including instructions for execution by processing circuitry 228 such as, but not limited to, instructions for performing the techniques described herein. Communication module 244 may provide one or more channels for receiving and/or transmitting information. Communication module 244 may be configured to perform wired and/or wireless communication with other devices, such as radio frequency communications. In other examples, communication module 244 may not be implemented, and instead, memory 226 may be removable (e.g., a removable flash memory).

Power source 240 delivers operating power to various components of system 218. Power source 240 may generate operational power from an alternating current source (e.g., residential or commercial electrical power outlet) or direct current source such as a rechargeable or non-rechargeable battery and a power generation circuit to produce the operating power. In other examples, non-rechargeable storage devices may be used for a limited period of time.

In one or more examples, the functions described in this disclosure may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media forming a tangible, non-transitory medium. Instructions may be executed by one or more processors, such as one or more DSPs, ASICs, FPGAs, general purpose microprocessors, or other equivalent integrated or discrete logic circuitry. Accordingly, the term “processor,” as used herein may refer to one or more of any of the foregoing structure or any other structure suitable for implementation of the techniques described herein.

In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules. Depiction of different features as modules or units is intended to highlight different functional aspects and does not necessarily imply that such modules or units must be realized by separate hardware or software components. Rather, functionality associated with one or more modules or units may be performed by separate hardware or software components, or integrated within common or separate hardware or software components. Also, the techniques could be fully implemented in one or more circuits or logic elements. The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including an IMD, an external programmer, a combination of an IMD and external programmer, an integrated circuit (IC) or a set of ICs, and/or discrete electrical circuitry, residing in an IMD and/or external programmer.

Further aspects of the disclosure will now be discussed, including further details of the techniques described herein. The example laboratory techniques described herein for accomplishing routine laboratory tasks, such as the collection of blood and the isolation of serum from blood, as well as others, are not intended to be limiting and may be performed by any suitable laboratory techniques. In addition to the techniques described above, supplementary techniques, as described below, may be employed. The example techniques described herein may be used to identify cell-free transcripts or other nucleic acids in blood that are specifically associated with the particular tumor and the host response, thereby creating sets of biomarkers with distinct diagnostic utility. In some examples, the human or animal disease may be re-created in a mouse (xenografts, xenotransplants), with the recognition that certain components of the response will be absent when immunodeficient mice are used (for example, the host T-cell component in athymic nude mice).

Tumor cells may regularly enter the circulation. Most of these cells die because they fail to adapt to conditions of growth outside the tumor niche. However, rare cells survive at distant sites and re-establish a niche that is suitable for tumor growth. It is known that tumors with different metastatic potential (including OS) display distinct patterns of gene expression, which may be derived from tumor cell-autonomous properties, but may be influenced by the local tumor microenvironment (TME). With respect to stratification of OS into prognostically significant groups, canine and human OS may be stratified into more and less aggressive groups based on their molecular signatures. The differences in behavior among these subtypes of OS are intrinsic; that is, they are not entirely due to therapeutic management. Moreover, while OS generally has high potential to metastasize, somewhere between 30%-40% of children and 10%-25% of dogs treated for this disease will survive more than 10 years and more than 2 years, respectively. Thus, some tumors metastasize more slowly than others, and this may be true for tumor xenografts in immunodeficient mice.

The example described herein illustrates that OS-2 xenografts may have a greater propensity to disseminate to the lung than OS-1 xenografts. Highly expressed genes present in stromal cells of OS-2 xenograft tumors are associated with inflammation and immune response. In addition, pulmonary fibroblasts treated with OS-2 exosomes show decreased expression of chemokines that are strongly chemotactic for lymphocytes and that establish an environment favoring polarization to the Th17 phenotype. OS-2 derived exosomes contain genes involved in immune regulation and inflammation. Both OS-1 and OS-2 exosomes may enable the re-programming of the tumor micro-environment and the establishment of a favorable niche for tumor dissemination and metastases.

FIGS. 6A and 6B are graphical representations of differences in gene expression and patient survival times between OS-1 and OS-2 phenotypes. FIG. 6A is a sample dendrogram indicating that representative sample canine cell lines OSCA-40 (associated with the OS-2 phenotype) and OSCA-32 (associated with the OS-1 phenotype) were used in the analysis of the example described herein. FIG. 6B is a chart that depicts a Kaplan-Meier analysis of overall survival time (ST) of canine patients according to the sample clustering shown in FIG. 6A. As shown in FIG. 6B, the median survival time of patients with OS-1 is 13.97 months, whereas the median survival time of patients with OS-2 is 2.83 months. The difference in survival times between the OS-1 and OS-2 phenotypes may be at least partially attributable to the greater metastatic potential of OS-2 relative to OS-1. In the example of OS-1 and OS-2, the OSCA-40 cell line may be rapidly metastatic, whereas the OSCA-32 cell line may be poorly or non-metastatic in vivo. Gene expression may vary between different OS cell lines. For example, miR-20a and miR-135b are expressed at higher levels in more aggressive cell lines; this pattern is conserved in human OS. In order to determine or anticipate the biological behavior of OS, OS xenografts can be established in the tibia or tarsus of immunocompromised mice, and the gene expression of the OS xenografts may be analyzed. In addition, such techniques may enable determination of biological interactions between the xenograft tissue (e.g., OS-1 or OS-2 cell lines) and the TME within the host tissue. These experiments may be of relatively short duration, and pain associated with tumor growth may be managed by using medication and/or amputation as appropriate.

Tumor exosomes package bioactive molecules such as mRNAs and microRNAs. Thus, nucleic acids enriched in OS tumors with distinct biological behaviors may be contained in exosomes. Exosomes in canine OS cell lines have been characterized in situ, by flow cytometry, and by biochemical analysis. The data suggest there may be preferential accumulation of mRNA species in cells derived from tumors with different behavior. Thus, at least some of the in vivo behavior of OS may be associated with an altered composition of exosomes that enter the circulation and influence distant environments in preparation for tumor dissemination.

For this study, cell lines derived from two spontaneous canine OS with distinctly different biological behavior (OS-1 and OS-2) were used for heterotypic in vivo modeling that recapitulates the heterogeneous biology and behavior of this disease. Both cell lines demonstrated stability of the transcriptome when grown as orthotopic xenografts in athymic nude mice. Consistent with the behavior of the original tumors, OS-2 xenografts grew more rapidly at the primary site and had greater propensity to disseminate to lung and establish microscopic metastasis. Moreover, OS-2 promoted formation of a different tumor-associated stromal environment than OS-1 xenografts. In addition to comprising a larger fraction of the tumors, a robust pro-inflammatory population dominated the stromal cell infiltrates in OS-2 xenografts, while a mesenchymal population with a gene signature reflecting myogenic signaling dominated those in the OS-1 xenografts. The studies described herein show that canine OS cell lines maintain intrinsic features of the tumors from which they were derived and recapitulate the heterogeneous biology and behavior of bone cancer in mouse models. This system provides a resource to understand interactions between tumor cells and the stromal environment that may drive progression and metastatic propensity of OS.

Understanding the heterogeneous biology and behavior of OS may be useful to fully elucidate the pathogenesis of osteosarcoma and other diseases. As described herein, orthotopic canine OS xenografts preserve the biological, molecular, and heterotypic heterogeneity observed in the tumors from which they were derived. Moreover, transcriptome analysis of xenograft tumors revealed a strong OS cell specific stromal response, which may provide evidence that intrinsic genetic tumor characteristics and cross-talk between tumor and stromal cells might underlie heterogeneity of biological behavior in OS patients. These data may provide insight into tumor-host interactions and identify targets that may play a role in treatment strategies for OS patients.

Results

FIGS. 7A and 7B are photographic representations of data pertaining to the application of the techniques described herein to the OS-1/OS-2 xenograft example. FIG. 7A includes photographs of mice injected with OS-1 and OS-2 cells at day 1 post-injection and again at day 8 post-injection. FIG. 7B includes photographs of the mice injected with OS-1 and OS-2 cells at days 8, 15, and 49 post-injection. Luminescence resulting from luciferase activity is shown in radiance (p/sec/cm²/sr). Differential metastatic propensity in orthotopic canine OS-1 and OS-2 xenografts: Luciferase activity was observed in the lungs of mice receiving intratibial OS-2 cells, but not in mice injected with OS-1 cells, within 6 hours of injections (FIG. 7A). This was interpreted as evidence of systemic dissemination of OS-2 cells with accumulation in the lungs. The luciferase signal disappeared from the lungs within one week after tumor administration, but the presence of OS-2 cells was evident focally in the lungs of one mouse from this group again within two weeks after tumor administration, and the luciferase activity in this area continued to increase until the end of the experiment (FIG. 7B). When the mice from all the experiments were considered together, OS-2 cells achieved metastatic dissemination more rapidly than OS-1 cells (by 15, 22, and 29 days), although the rate of microscopic and macroscopic metastasis between the two groups on Day 36 when the experiments were terminated was not different based on imaging (p=0.35) or histopathology (p=0.77; see also the table illustrated in FIG. 22).

FIGS. 8A-8C are photographic and graphical representations of data pertaining to the application of the techniques described herein to the OS-1/OS-2 xenograft example, and indicate differential growth rates at the primary site in orthotopic canine OS-1 and OS-2 xenografts. FIG. 8A includes photographs of mice injected with OS-1 and OS-2 cells at days 1, 29, and 57 post-injection. Luminescence resulting from luciferase activity is shown in radiance (p/sec/cm²/sr). FIG. 8B illustrates in vivo luciferase activity at different times from 1-59 days post-injection. FIG. 8C illustrates disease progression over time as indicated by tibia volume. Development and progression of primary tumors were examined using in vivo imaging starting six hours after orthotopic cell injections and then weekly for the duration of the study. Luciferase activity was detectable within 6 hours in many of the mice receiving OS-1 or OS-2 cells, and all of the mice showed disease progression over time. Expansion of tumor cells may be inferred from the increased luciferase emission over time. FIG. 8B shows that OS-2 intratibial xenografts had grown significantly faster than OS-1 intratibial xenografts by day 22 and this difference persisted until day 50. The results shown in FIG. 8C encompass a more complex process, as the physical size of the tumors in the proximal tibia would be influenced by infiltrating host stromal cells and swelling. The data confirm that OS-2 intratibial xenografts grew significantly faster than OS-1 intratibial xenografts in this example, although the effect was delayed (detectable by day 29), with this relative difference persisting until day 50 (FIGS. 8B and 8C; see also the table illustrated in FIG. 22). It is worth noting that neither indirect imaging measurements nor direct physical measurements may necessarily account for tumor invasion and loss of periosteal integrity, as is described below. Nevertheless, the data shown in FIGS. 8A-8C and FIG. 23 indicate that, in this example, disease progression was significantly faster in animals harboring OS-2 xenografts than in animals harboring OS-1 xenografts.

FIGS. 9A-9J are graphical representations of data pertaining to the application of the techniques described herein to the OS-1/OS-2 xenograft example. Primary and metastatic tumors derived from orthotopic implantation of OS-1 and OS-2 cells show histological features and organization that are characteristic of canine OS: All of the mice injected with OS-1 or OS-2 cells had evidence of gross tumor burden in the proximal tibia at necropsy on the eighth week after injection (FIGS. 9A and 9B). Histologically, OS-1-derived tumor xenografts were characterized by relatively well-differentiated, polygonal to spindle-shaped cells that had round to oval nuclei, mild to moderate anisocytosis and anisokaryosis, and infrequent mitotic activity (FIGS. 9C and 9E). These tumors contained organized osteoid ribbons and showed limited destruction of cortical bone and epiphyseal invasion (FIG. 9C).

In contrast, OS-2 tumors had a more aggressive appearance with spindle-shaped, anaplastic cells that had round to elongate nuclei, moderate anisocytosis and anisokaryosis, and frequent mitotic activity (FIGS. 9D and 9F). The cells in these tumors were embedded in a poorly organized, pale eosinophilic matrix and they showed extensive necrosis with marked destruction of cortical bone and epiphyseal invasion (FIG. 9D).

The different metastatic propensity of OS-1 and OS-2 was confirmed histologically (FIGS. 9G-9J; also see the table illustrated in FIG. 22). Fewer than 20% of the mice injected with OS-2 and 7% of the mice injected with OS-1 developed metastasis by Day 36. When lung metastasis was present, the histological appearance of the metastatic tumors recapitulated that of the parent tumors, as illustrated by the photomicrographs from one mouse receiving OS-2 orthotopically in FIGS. 9H and 9J. In these animals, the morphology and mitotic activity of the cells and their residence in a poorly organized, pale eosinophilic matrix with extensive areas of necrosis and frequent mitotic activity were comparable to that seen in the primary tumors.

FIGS. 10A-19 illustrate bioinformatics methods that may be used to carry out one or more portions of the methods described herein as applied to the OS-1/OS-2 example. Generally, FIGS. 10A-19 illustrate that a single hybrid reference genome for two species is created by combining the reference sequences of all chromosomes of each species into one file, with chromosome names modified to indicate the species of origin. A single hybrid genome annotation file describing the locations of genes in the genome is created by combining the annotation of each species into one file, with chromosome and gene names modified to indicate the species of origin. A sequence alignment program such as HISAT2 is used to align RNA-Seq sequence reads to the hybrid genome. Most reads may map uniquely to a chromosome of one of the species. Some parts of the genomes may be identical in both species, resulting in a small number of multi-mapped reads mapping to two chromosomes, one from each species, although longer sequence reads reduce the number of multi-mapped reads. The presence and abundance levels of genes may be determined by comparing the genomic location of each uniquely aligned read with the genomic locations of genes in the hybrid annotation file and summing the number of reads aligning to each gene. Excluding multi-mapped reads from this abundance estimation step may help avoid incorrectly identifying the presence of graft-derived nucleic acids. Aligning RNA-Seq reads only to the reference genome of the graft species will result in the spurious identification of graft-derived genes in cases where the genes have identical sequences in both species. Comparing gene expression levels from a xenograft sample with those from a negative control sample provides further power to reduce false positives.

FIG. 10A is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example, and illustrates a portion of the bioinformatics workflow analysis in accordance with an example technique of this disclosure. As shown in FIG. 10A, Phred scores may be used for a quality control check to ensure that the raw data fall within acceptable parameters and that there are no problems or biases in the data. In FIG. 10A, a plot of mean quality values (Phred score) across all bases at each position in the sequence read is prepared. A Phred score >28 indicates good calling performance. Greyscale shading indicates quality: very good quality calls (246), and calls of reasonable quality (248).
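
The per-position quality summary described above may be sketched as follows. This is an illustrative example only: it assumes that quality strings use the Phred+33 ASCII encoding, and the function name and the toy quality strings are hypothetical. The threshold of 28 is the value stated above.

```python
def mean_phred_by_position(quality_strings, offset=33):
    """Return the mean Phred score at each read position.

    `quality_strings` are FASTQ quality lines; `offset` is the ASCII offset
    of the encoding (33 is assumed here). Reads may differ in length, so
    each position is averaged over the reads that reach it.
    """
    totals, depth = [], []
    for quality in quality_strings:
        for i, ch in enumerate(quality):
            if i == len(totals):
                totals.append(0)
                depth.append(0)
            totals[i] += ord(ch) - offset
            depth[i] += 1
    return [t / d for t, d in zip(totals, depth)]


# Toy quality strings: 'I' encodes Q40, '?' encodes Q30, '5' encodes Q20.
quals = ["IIII", "II5", "I?5"]
means = mean_phred_by_position(quals)

# Positions (0-based) whose mean quality does not exceed the threshold of 28.
low_positions = [i for i, m in enumerate(means) if m <= 28]
```

Plotting `means` against position would give a curve of the kind summarized in FIG. 10A; positions listed in `low_positions` would fall outside the good-quality range.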

FIG. 10B is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example, and further illustrates a portion of the bioinformatics workflow analysis in accordance with an example technique of this disclosure. As shown in FIG. 10B, a percent of sequences aligned to the cross-species hybrid genome, such as by using a HISAT2 aligner to align the RNA sequences to the cross-species hybrid genome, is determined.

FIG. 11 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example, and further illustrates a portion of the bioinformatics workflow analysis in accordance with an example technique of this disclosure. As shown in FIG. 11, gene abundance values may be generated and graphically rendered. The presence and abundance levels of genes are determined by comparing the genomic location of each uniquely aligned read with the genomic locations of genes in the hybrid annotation file and summing the number of reads aligning to each gene. For this analysis, raw counts may be generated by a feature counts summarization program as the abundance value. Such a program does not count reads overlapping with more than one genomic region. FIG. 11 illustrates violin plots that show the distributions of gene abundance values, although other types of plots also may be used to illustrate gene abundance values.
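
The counting rule described above may be illustrated with a short Python sketch. It is not the summarization program used in the example: the gene identifiers, coordinates, and reads are hypothetical, intervals are treated as closed, and a simple linear scan stands in for the indexed lookup a production tool would use.

```python
from collections import Counter


def count_reads_per_gene(reads, genes):
    """Sum uniquely aligned reads per gene, skipping ambiguous reads.

    `reads` holds (chromosome, start, end) tuples; `genes` holds
    (gene_id, chromosome, start, end) tuples. A read is counted only when
    it overlaps exactly one gene; reads overlapping no gene or more than
    one gene are not counted.
    """
    counts = Counter({gene[0]: 0 for gene in genes})
    for chrom, start, end in reads:
        hits = [
            gene_id
            for gene_id, g_chrom, g_start, g_end in genes
            if g_chrom == chrom and start <= g_end and end >= g_start
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts


# Hypothetical hybrid annotation with species-prefixed names.
genes = [
    ("dog_GENE_A", "dog_chr1", 100, 200),
    ("dog_GENE_B", "dog_chr1", 180, 300),
    ("mouse_Gene_c", "mouse_chr1", 100, 200),
]
reads = [
    ("dog_chr1", 110, 159),    # overlaps dog_GENE_A only
    ("dog_chr1", 120, 169),    # overlaps dog_GENE_A only
    ("dog_chr1", 170, 219),    # overlaps both dog genes: not counted
    ("mouse_chr1", 150, 199),  # overlaps mouse_Gene_c only
    ("mouse_chr1", 500, 549),  # overlaps no gene: not counted
]
gene_counts = count_reads_per_gene(reads, genes)
```

The resulting raw counts per gene are the kind of abundance values whose distributions a violin plot could then display for each sample.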

FIG. 12 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example, and further illustrates a portion of the bioinformatics workflow analysis in accordance with an example technique of this disclosure. Specifically, FIG. 12 is a plot from multidimensional scaling of gene abundance values, which is examined to explore relationships between samples and to identify any potential outliers, and illustrates a multidimensional scaled plot for samples in this experiment based on the gene abundance values illustrated in FIG. 11. Note that the scale has been shrunken so that the separation can be appreciated. Samples from controls and mice without tumors form a tight cluster, with some separation along the first dimension for tumor samples, and separation along the second dimension between the tumor samples.

FIG. 13 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example, and further illustrates a portion of the bioinformatics workflow analysis in accordance with an example technique of this disclosure. Specifically, FIG. 13 is a hierarchical clustering heatmap illustrating that samples that are most similar occupy closer positions in the tree, while samples that are less similar are separated by larger numbers of branch points. Hierarchical clustering heatmaps, such as the heatmap of FIG. 13, may be generated after converting gene abundance values to z-scores. A color or greyscale scheme may be applied for the visualization of “high” and “low” gene abundance values in the samples, as shown in FIG. 13. For the sake of illustration, an area indicating low z-scores is depicted at 250, and an area of high z-scores at 252. The rows of the heatmap identify the names of the genes represented in the heatmap. In the heatmap of FIG. 13, the dog genes are identifiable by having an Ensembl gene name (e.g., ENSCAFG . . . ID), while the mouse genes are identifiable by murine gene symbol. OSCA-40 replicate samples cluster together away from OSCA-32 and the control samples, indicating that they are more similar to one another than to other samples.
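
The conversion of gene abundance values to z-scores mentioned above may be sketched as follows. The sketch assumes that each gene is standardized across samples using the population standard deviation; the disclosure does not specify which estimator was used, and the abundance values shown are made-up numbers.

```python
from statistics import mean, pstdev


def row_z_scores(values):
    """Convert one gene's abundance values across samples to z-scores.

    Uses the population standard deviation (an assumption). A gene with
    identical values in every sample is mapped to all zeros.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up abundance values for one gene across eight samples.
abundance = [2, 4, 4, 4, 5, 5, 7, 9]
z = row_z_scores(abundance)
```

After this step, every gene has mean zero across samples, so “high” and “low” shading in a heatmap reflects each gene's deviation from its own average rather than its absolute abundance.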

FIG. 14 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example, and further illustrates a portion of the bioinformatics workflow analysis in accordance with an example technique of this disclosure. As illustrated in FIG. 14, identification of canine and murine genes may be performed to identify genes that are differentially abundant in: controls versus xenograft samples (canine specific sequences), and OS-1 versus OS-2 xenograft samples. FIG. 14 illustrates a hierarchical clustering heatmap of log transformed and mean centered gene abundance values. The heatmap represents clustered gene-level counts with lower than the mean (values of −3.00 to −1.00; area 254), higher than the mean (values of 1.00 to 3.00; areas 256 and 258), and mean (value of 0.00) levels of expression. Each row of the heatmap represents a single gene. As shown in FIG. 14, there are a number of highly correlated genes that are abundant in xenograft samples, OS-1 (OSCA-32) and OS-2 (OSCA-40), compared to control samples (denoted as Cluster 1). These 1078 genes all had canine gene IDs. There are a number of highly correlated genes that are more abundant in OS-2 xenografts compared to OS-1 xenograft samples (denoted as Cluster 4). In the example illustrated in FIG. 14, such genes all had canine gene IDs.
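
The log transformation and mean centering of gene abundance values may be illustrated with a short sketch. The base-2 logarithm and the pseudocount of 1 (added so that zero counts can be transformed) are assumptions made for illustration, as the disclosure does not state them; the counts are made-up numbers.

```python
import math


def log_center(counts, pseudocount=1.0):
    """Log-transform one gene's counts and center them on their mean.

    The base-2 logarithm and the pseudocount are illustrative assumptions.
    """
    logged = [math.log2(c + pseudocount) for c in counts]
    mu = sum(logged) / len(logged)
    return [x - mu for x in logged]


# Made-up raw counts for one gene across four samples.
centered = log_center([0, 1, 3, 7])
```

Values below zero then indicate lower-than-mean abundance for that gene and values above zero indicate higher-than-mean abundance, matching the way the heatmap ranges are described above.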

FIG. 15 is a graphical representation of a data analysis technique in accordance with the examples of this disclosure, as applied to the OS-1/OS-2 xenograft example, and further illustrates a portion of the bioinformatics workflow analysis in accordance with an example technique of this disclosure, including the determination of biological processes and canonical pathways associated with gene clusters identified from the hierarchical clustering heatmap of FIG. 14. FIG. 15 depicts the heatmap of FIG. 14 overlaid with the outcome of a technique that includes predicting upstream regulators for the seven gene clusters that were identified from the hierarchical clustering heatmap by the data analysis technique of FIG. 14. FIG. 15 illustrates that Cluster 7 consists exclusively of mouse genes.

FIG. 16 is a graphical representation of a data analysis technique inaccordance with the examples of this disclosure, as applied to theOS-1/OS-2 xenograft example, and further illustrates a portion of thebioinformatics workflow analysis in accordance with an example techniqueof this disclosure. As illustrated in FIG. 16 , biological processes andcanonical pathways associated with gene clusters identified from thehierarchical clustering heatmap of FIGS. 14 and 15 may be determined.The heat map pictured in FIG. 16 is the same as shown in FIGS. 14 and 15. Top predicted transcriptional regulators of genes for each genecluster defined by hierarchical clustering are shown in the heatmap.FIG. 16 illustrates that Cluster 7 consists of exclusively mouse genes.ErbB-1 also is named epidermal growth factor receptor (EGFR), and ErbB-2is also named HER2 in humans and neu in rodents.

FIG. 17 is a graphical representation of a data analysis technique inaccordance with the examples of this disclosure, as applied to theOS-1/OS-2 xenograft example, and further illustrates a portion of thebioinformatics workflow analysis in accordance with an example techniqueof this disclosure. FIG. 17 illustrates a determination of canine andmurine genes that have statistically different abundances andidentification of predicted upstream regulators of these genes withrespect to (1) controls versus xenograft samples, and (2) OS-1 xenograftsamples versus OS-2 xenograft samples. As illustrated in FIG. 17 ,predicted upstream regulators for 125 statistically different mouse(host) genes between controls and xenograft samples were identifiedusing the data analysis technique of FIG. 17 .

FIG. 18 is a graphical representation of a data analysis technique inaccordance with the examples of this disclosure, as applied to theOS-1/OS-2 xenograft example, and further illustrates a portion of thebioinformatics workflow analysis in accordance with an example techniqueof this disclosure. At FIG. 18 , predicted upstream regulators for 325statistically (p<0.005) different mouse (host) genes between OS-1 andOS-2 xenograft samples are determined. Predicted activity is in OS-2with respect to OS-1 xenograft samples.

FIG. 19 is a graphical representation of a data analysis technique inaccordance with the examples of this disclosure, as applied to theOS-1/OS-2 xenograft example, and further illustrates a portion of thebioinformatics workflow analysis in accordance with an example techniqueof this disclosure. FIG. 19 illustrates predicted upstream regulators of530 statistically (p<0.05) different canine genes between OS-1 and OS-2xenograft samples. Predicted activity is in OS-2 with respect to OS-1xenograft samples. In some examples, canine genes may validatepreviously reported data indicating differences between tumors arisingfrom OS-1 and OS-2, although canine genes isolated from blood may bedifferent from genes obtained from tissue (e.g., from tumor tissue).

FIGS. 20A and 20B are graphical representations of a data gathering andanalysis technique in accordance with the examples of this disclosure.FIG. 20A illustrates that gene signatures of tumor cells in OSxenografts resemble those of parent cell lines. FIG. 20B illustrateshierarchical clustering of tumor xenografts and parent cell lines withcanine and murine genes. One obstacle to using xenograft models tounderstand the heterogeneity of genetically complex tumors is thepresumption that these tumors are unstable and will drift rapidly asthey adapt to the host microenvironment. Indeed, previous data suggestthat altered genomic signatures due to tumor cell plasticity and/orharsh clonal selection lead to unpredictable behavior of tumor celllines after being transplanted into mice. Here, RNA sequencing was usedto examine the stability of key transcriptomic properties between theparental OS cell lines and their corresponding tumor xenografts. Thetumor xenografts were more similar to their corresponding parent celllines than to each other or to the alternative cell line based onprincipal components analysis and by unsupervised clustering (FIG. 20A),where tumor xenografts were assigned to the same group as theircorresponding parent cell line based on the expression signatures fromcanine genes.

As shown in FIG. 20A, gene signatures of parent tumor cell lines were maintained in OS-1 and OS-2 xenograft tumors. A total of 24,579 canine genes were filtered to remove genes that did not have a log₂ counts per million (CPM) mean-centered value ≥1 in at least two samples. 13,141 genes remained after filtering. The heatmap represents clustered gene-level counts with lower than mean (values of −2.00 to approximately −0.10; e.g., areas 266 in FIG. 20B), higher than the mean (approximately 0.10 to 2.00; e.g., area 267 in FIG. 20B), and mean (0.00) levels of expression. Each row represents a single gene. The dendrogram represents the distance or dissimilarity between sample clusters, calculated using unsupervised hierarchical clustering on CPM values for the 13,141 filtered genes. In this dendrogram, there are two sample clusters, shown as two branches that occur at about the same vertical distance. One of the sample clusters consists of four OS-1 (260) xenograft tumors (261) and two parental cell line replicates (262), and the other consists of four OS-2 (263) xenograft tumors (264) and two parental cell line replicates (265). All replicates are biological replicates.
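The filtering criterion described for FIG. 20A (a log₂ CPM mean-centered value ≥1 in at least two samples) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code used in the example: the function names, the pseudocount `prior`, and the sample values are hypothetical.

```python
import math


def log2_cpm(counts, prior=1.0):
    """Return log2 counts-per-million per gene and sample.

    counts: dict mapping gene -> list of raw counts, one entry per sample.
    prior:  pseudocount added before the log to avoid log2(0) (an assumption).
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(c[i] for c in counts.values()) for i in range(n_samples)]
    return {
        gene: [math.log2((c[i] + prior) / lib_sizes[i] * 1e6)
               for i in range(n_samples)]
        for gene, c in counts.items()
    }


def mean_center(row):
    """Subtract the gene's mean across samples from each value."""
    m = sum(row) / len(row)
    return [v - m for v in row]


def filter_genes(centered, threshold=1.0, min_samples=2):
    """Keep genes whose centered value is >= threshold in >= min_samples samples."""
    return {
        gene: row
        for gene, row in centered.items()
        if sum(v >= threshold for v in row) >= min_samples
    }


# Hypothetical log2 CPM values for three genes across four samples.
log_values = {
    "g1": [0.0, 0.0, 4.0, 4.0],   # two samples well above the gene mean
    "g2": [1.0, 1.0, 1.0, 1.5],   # nearly flat across samples
    "g3": [0.0, 0.0, 0.0, 8.0],   # only one sample above the threshold
}
centered = {gene: mean_center(row) for gene, row in log_values.items()}
kept = filter_genes(centered, threshold=1.0, min_samples=2)
```

Raising `threshold` to 3 gives the stricter criterion described for FIG. 20B.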

When dog and mouse genes were analyzed together, expression ofmouse-specific genes was not detected in the canine cell lines (FIG.20B), indicating that the mouse genes present in the tumor xenografttissues could be accurately differentiated from the dog genes using thecomparative bioinformatics approach. Furthermore, significantly largernumbers of mouse genes were detectable in OS-2 than in OS-1 xenografts,suggesting the former tumors were more heavily infiltrated by hoststroma (FIG. 20B).

FIG. 20B illustrates log-transformed and mean-centered counts permillion (CPM) values for 47,997 canine and murine genes in xenograft andparental cell line samples that were filtered to remove genes that didnot have a log 2 CPM mean-centered value ≥3 in at least two samples.This filtering step excluded most of the canine genes while includingthe murine genes. Unsupervised hierarchical clustering with counts permillion values for the remaining 13,968 canine and murine genesindicated that expression levels of murine genes in canine cell lineswas absent in comparison to the tumor xenografts, illustrating thatmurine genes can be properly differentiated from the canine genes inxenografts.

FIGS. 21A-21C are graphical representations of data gathering and analysis techniques in accordance with the examples of this disclosure, and illustrate that OS-1 and OS-2 xenografts may promote distinct tumor-associated stromal environments. To determine the nature of the stromal interactions and the identity of the infiltrating cells in the xenografts, pair-wise Exact Test comparisons were performed, with TMM normalization of gene counts, to identify the differentially expressed murine genes in tumors from each group (OS-1 and OS-2). Four biological replicates were used for each OS subtype. Common dispersion across all genes was calculated as 0.079 and the biological coefficient of variation (BCV) as 0.23. Mean tag-wise dispersion (individual dispersion for each gene) was calculated as 0.095. Using a false discovery rate (FDR)-adjusted p-value of <0.005 and a log₂ fold change >2, 482 genes were identified that were expressed at significantly different levels between the two groups (FIG. 21A; Table S2). After identifying differentially expressed genes (DEGs), log-transformed and mean-centered counts per million (CPM) values for 47,997 canine and murine genes were generated. The Pearson distance similarity metric and the average linkage clustering method were used for hierarchical clustering of log₂ CPM values for the 482 differentially expressed murine genes (see Table S2 below for detailed gene lists).
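The two ingredients named for the hierarchical clustering, the Pearson distance and average linkage, can be sketched as follows. This is a simplified illustration rather than the software used in the example: Pearson distance is taken here as one minus the Pearson correlation coefficient (a common convention, assumed rather than stated in the text), and the expression profiles are hypothetical.

```python
import math


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return 1.0 - cov / (sx * sy)


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise Pearson distance between members of two clusters."""
    dists = [
        pearson_distance(profiles[i], profiles[j])
        for i in cluster_a
        for j in cluster_b
    ]
    return sum(dists) / len(dists)


# Hypothetical log2 CPM profiles for three genes across three samples.
profiles = {
    "a": [1.0, 2.0, 3.0],
    "b": [2.0, 4.0, 6.0],   # perfectly correlated with "a"
    "c": [3.0, 2.0, 1.0],   # perfectly anti-correlated with "a"
}
d_ab = pearson_distance(profiles["a"], profiles["b"])
d_ab_c = average_linkage(["a", "b"], ["c"], profiles)
```

Under this convention the distance ranges from 0 (identical shape) to 2 (opposite shape), so genes that rise and fall together across samples are merged first regardless of their absolute abundance.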

The heatmap in FIG. 21A shows clustered gene-level counts with lower than mean (negative values; e.g., area 270), higher than the mean (positive values; e.g., area 272), and mean (value=zero) levels of expression. Each row represents a single gene. The dendrogram of the horizontal axis of the heatmap shows two sample clusters; OS-1 and OS-2 xenografts are in separate sample groups (FIG. 21A). The rows of the heatmap (vertical axis) cluster into two highly correlated groups. Rows corresponding to a positive value in the vertical dendrogram are murine genes that are upregulated in OS-2 xenografts (e.g., a majority of the rows in region 270), whereas rows corresponding to a negative value are downregulated relative to OS-1 xenografts (e.g., a majority of the rows in region 272). Enriched pathway and functional classification analyses of DEGs were performed using QIAGEN's Ingenuity® Pathway Analysis (IPA®, QIAGEN Redwood City, www.qiagen.com/ingenuity) according to row cluster designation. Upregulated genes are identified in FIG. 21B, and downregulated genes are identified in FIG. 21C.

To better understand the differences between OS-1 and OS-2, the upregulated and downregulated murine genes in OS-2 were considered as separate lists, and IPA was used to identify enriched biological functions and transcription factors that regulate these genes. The 482 differentially expressed murine genes included 240 that were upregulated and 242 that were downregulated in OS-2 tumor xenografts relative to OS-1 tumor xenografts. The most upregulated murine gene in the OS-2 xenografts was Mcpt1 (+11.25 fold), whereas the most downregulated murine gene was Nkx2-1 (−10.97 fold) (see Table S2, which is shown below at paragraph [0140]).

Based on biological function and processes, the most upregulated murine genes in OS-2 tumors were proteases, metallopeptidases, cytokines, and chemokines involved in cell movement, leukocyte migration, inflammation, and angiogenesis (FIG. 21B, Table S2). On the other hand, the most downregulated genes in OS-2 tumor xenografts were transcriptional regulators of cellular differentiation and cell cycle involved in formation and morphology of muscle (FIG. 21C, Table S2).

FIGS. 22-27 are tables providing additional information pertaining to the application of the techniques described herein to the OS-1/OS-2 xenograft example. For example, the tables illustrated in FIGS. 22-27 provide information pertaining to the metastatic rates, rates of tumor progression, and pathways for differentially expressed murine genes and their upstream regulators for this particular example.

FIG. 22 is a table that illustrates the metastatic properties of theOS-1 xenografts as compared to the OS-2 xenografts at 15-57 dayspost-injection. When the mice from all the experiments were consideredtogether, OS-2 cells achieved metastatic dissemination more rapidly thanOS-1 cells (by 15, 22, and 29 days), although the rate of microscopicand macroscopic metastasis between the two groups on Day 36 when theexperiments were terminated was not different based on imaging (p=0.35)or histopathology (p=0.77). The different metastatic propensity of OS-1and OS-2 was confirmed histologically as illustrated in FIG. 9 .

FIG. 23 is a table that illustrates that the OS-1 and OS-2 xenografts show differential rates of tumor progression. As shown in FIG. 23, the progress of OS-2 xenograft tumors (as determined by the measured change in tumor volume) was significantly more rapid than the progress of the OS-1 xenograft tumors from 22-43 days post-injection.

FIG. 24 is a table that illustrates a MetaCore analysis identifying pathways for murine genes that are differentially expressed between OS-1 and OS-2 xenograft tumors. The top 10 most enriched pathways, shown in FIG. 24, suggest immune and inflammatory themes that modulate IL-17, TGF-beta signaling, the complement system, and patterning behavior and cytoskeletal remodeling with involvement of Rho GTPases. Analysis of the 482 murine genes identified as differentially expressed between OS-1 and OS-2 xenograft tumors was performed with MetaCore software (https://portal.genego.com/) to show the top 10 processes and pathways ranked in terms of the enrichment of the common target-related genes (p-value).

FIG. 25 is a table identifying upstream regulators of differentially expressed murine genes in OS-2 xenograft tumors as compared to OS-1 xenograft tumors. These upstream regulators of the 482 differentially expressed murine genes were identified using IPA. The most significant predicted activated upstream regulators in OS-2 (worse prognosis) relative to OS-1 tumor xenografts were CEBPB and NFKB1 (p-values 5.54E-10 and 3.94E-09, respectively), whereas the most significant predicted inhibited upstream regulator was MEF2C (p-value 2.54E-23) (FIG. 25). The retinoblastoma tumor suppressor gene (RB1) was also among the predicted significant upstream regulators (p-value 1.25E-04), showing inactivation in OS-2 xenograft tumors, as may otherwise have been predicted based on previous studies. The differentially expressed murine genes were determined by pair-wise Exact Test comparisons in EdgeR. IPA was used to determine upstream modulators of the 482 differentially expressed genes and their predicted activities. Predicted activity is based on gene expression values in OS-2 xenografts relative to OS-1 xenografts. A Z-score >2 indicates activation, while a Z-score <−2 indicates inactivation.
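The Z-score convention stated above (greater than 2 indicates activation, less than −2 indicates inactivation) can be expressed as a small helper. This is only an illustration of the thresholding rule; the function name and the example Z-scores are hypothetical and are not values taken from FIG. 25.

```python
def predicted_state(z_score, cutoff=2.0):
    """Classify an upstream regulator from its activation Z-score."""
    if z_score > cutoff:
        return "activated"
    if z_score < -cutoff:
        return "inactivated"
    return "no prediction"


# Hypothetical Z-scores for three unnamed regulators.
example_scores = {"regulator_x": 3.1, "regulator_y": -4.2, "regulator_z": 0.7}
states = {name: predicted_state(z) for name, z in example_scores.items()}
```

Values that fall between the two cutoffs are treated here as carrying no directional prediction, which is an assumption about how such intermediate scores would be reported.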

FIG. 26 is a table identifying upstream regulators of the upregulated murine genes in OS-2. The upstream regulators predicted to modulate expression and activity of the 240 upregulated murine genes expressed in the OS-2 tumor xenografts included the T-helper cell type-17 (Th17) activating cytokines TGF-β (p-value 1.26E-27), IL-1B (p-value 9.07E-25), and IL-6 (p-value 9.03E-22). Differentially expressed genes were determined by pair-wise Exact Test comparisons in EdgeR. IPA was used to determine upstream modulators of the 240 upregulated murine genes in OS-2 xenografts. Predicted activity is based on gene expression values in OS-2 xenografts. A Z-score >2 indicates activation, while a Z-score <−2 indicates inactivation.

FIG. 27 provides a table indicating upstream regulators of downregulated murine genes in OS-2 xenografts relative to OS-1 xenografts. The top upstream regulators predicted to modulate expression and activity of the 242 downregulated murine genes in the OS-2 xenografts were MEF2C and MYOD1 (p-values 1.15E-24 and 2.16E-15, respectively). MEF2C and MYOD1, both predicted as being inhibited in OS-2 xenografts and activated in OS-1 tumors, are important in promoting transcription of muscle-specific target genes and play a role in muscle differentiation. Differentially expressed genes were determined by pair-wise Exact Test comparisons in EdgeR. IPA was used to determine upstream modulators of the 242 downregulated murine genes in OS-2 xenografts. Predicted activity is based on gene expression values in OS-2 xenografts. A Z-score >2 indicates activation, while a Z-score <−2 indicates inactivation.

FIGS. 28A-28C are graphical representations of a workflow by which RNAcontents of OS-derived exosomes from cultured cells may be defined usingnext-generation sequencing, and outcomes of example of data analysesperformed on data derived from the workflow indicating that exosomesfrom OS-1 and OS-2 contain transcripts involved in different cellbehaviors. FIG. 28A illustrates that an example workflow may includeestablishing the presence of exosomes in OS cultured cells, purifyingthe exosomes from OS cell culture and validating the purification,isolating RNA molecules, sequencing the RNA molecules, and thenperforming bioinformatics analysis on the resulting sequences todetermine the identity of the RNA molecules, as well as othercharacteristics of the RNA molecules. For example, as shown in FIGS. 28Band 28C, other characteristics of the RNA molecules that may bedetermined by the bioinformatics analysis of FIG. 28A may includeidentifying pathways and cell behaviors associated with genescorresponding to the sequenced RNA molecules. For example, the heatmapand table of FIG. 28B indicate that OS-1 derived exosomes may containRNA associated with genes involved in cellular signaling and metabolism,whereas the heatmap and table of FIG. 28C indicate that OS-2 derivedexosomes may contain RNA associated with genes involved in communicationbetween innate and adaptive immune cells.

FIGS. 29A-29C are graphical representations of a workflow by which RNAcontents of OS-derived exosomes from cultured cells may be defined, andoutcomes of example data analyses performed on data derived from theworkflow indicating that decreased expression of cytokines may be foundin fibroblasts treated with OS-2 derived exosomes. FIG. 29A illustratesthat an example workflow may include culturing a test group offibroblasts with exosomes derived from OS cells (with a phenotype ofeither OS-1 or OS-2), and culturing a control group of fibroblastswithout exosomes derived from OS cells. The workflow may further includeisolating RNA from the groups of fibroblasts, sequencing the RNAmolecules, and then performing bioinformatics analysis on the resultingsequences to determine the identity of the RNA molecules and differencesin gene expression between the test fibroblasts that were cultured withOS exosomes and the control fibroblasts that were cultured without OSexosomes. FIG. 29B is a table indicating differentially-expressed genesidentified in fibroblasts that were cultured with OS-1 exosomes, ascompared to fibroblasts cultured with OS-2 exosomes and/or as comparedto fibroblasts cultured without OS exosomes. FIG. 29C is a tableindicating differentially-expressed genes identified in fibroblasts thatwere cultured with OS-2 exosomes, as compared to fibroblasts culturedwith OS-1 exosomes and/or as compared to fibroblasts cultured without OSexosomes. In the illustrated example of FIGS. 29A-29C, the RNA sequencesderived from the fibroblasts cultured with OS-2 exosomes indicated thatsuch fibroblasts had decreased expression of cytokines as compared tofibroblasts cultured with OS-1 exosomes and/or as compared tofibroblasts cultured without OS exosomes.

FIGS. 30A and 30B are graphical representations of differentially expressed mouse genes. The heatmaps depicted in FIGS. 30A and 30B illustrate the top differentially expressed mouse genes identified by MetaCore analysis, and indicate that different platforms and different methods of measuring gene expression and analyzing the resulting data may produce consistent results.

FIGS. 31A and 31B are graphical representations of differentiallyexpressed mouse genes and canine orthologs of the mouse genes. Thedifferentially-expressed mouse genes are depicted by the heatmap in FIG.31A, and the canine orthologs of the differentially-expressed mousegenes are depicted in the same order by the heatmap in FIG. 31B. FIGS.31A and 31B illustrate that the differential expression of mousetranscripts (e.g., mRNA profiles) in exosomes isolated from miceharboring OS-1 or OS-2 xenografts was not due to spurious mapping ofcanine genes to the mouse genome.

FIG. 32 is a graphical illustration of a bioinformatics method that shows the number of transcripts at each step of differential expression analysis. The ‘DESeq2’ package in RStudio (version 0.99.491) was used for differential analysis of transcript counts obtained from Kallisto data. Transcript counts were first summarized to gene counts, and then DESeq2 was used to convert count values to integer mode, correct for library size, and estimate dispersions and log₂ fold changes between comparison groups. Genes with a BH-adjusted p-value <0.05 and an absolute log₂ fold change >2 between control and xenograft samples were called significant. Statistically differentially expressed canine genes were removed if they had a DESeq2 normalized value of greater than zero in the control (mouse sequences), as these may be genes that are highly homologous between mouse and canine.
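The significance calls and the homology filter described for FIG. 32 can be sketched in plain Python as follows. This is not DESeq2 and does not reproduce its dispersion or fold-change estimation; it only illustrates the Benjamini-Hochberg (BH) adjustment and the thresholding steps applied to already-computed results. The record fields (`pvalue`, `log2fc`, `species`, `control_norm`), gene names, and values are all hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_significant(results, alpha=0.05, min_abs_lfc=2.0):
    """Apply the adjusted p-value, fold-change, and homology filters."""
    padj = bh_adjust([r["pvalue"] for r in results])
    kept = []
    for record, q in zip(results, padj):
        if q >= alpha or abs(record["log2fc"]) <= min_abs_lfc:
            continue
        # Drop canine genes detected in mouse-only controls: these may be
        # sequences that are highly homologous between the two species.
        if record["species"] == "canine" and record["control_norm"] > 0:
            continue
        kept.append(record["gene"])
    return kept


# Hypothetical per-gene results (control versus xenograft).
results = [
    {"gene": "dog_gene_1", "species": "canine", "pvalue": 0.001, "log2fc": 3.1, "control_norm": 0.0},
    {"gene": "dog_gene_2", "species": "canine", "pvalue": 0.002, "log2fc": 4.0, "control_norm": 5.2},
    {"gene": "mouse_gene_1", "species": "mouse", "pvalue": 0.003, "log2fc": -2.5, "control_norm": 12.0},
    {"gene": "mouse_gene_2", "species": "mouse", "pvalue": 0.5, "log2fc": 3.0, "control_norm": 9.0},
    {"gene": "mouse_gene_3", "species": "mouse", "pvalue": 0.004, "log2fc": 1.2, "control_norm": 7.0},
]
significant = call_significant(results)
```

In this toy input, `dog_gene_2` passes both statistical thresholds but is removed because it has a nonzero normalized value in the control, mirroring the filter described above for potentially cross-mapping sequences.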

FIGS. 33A-33C are graphical representations of 198 differentially expressed transcripts. FIG. 33A is a heatmap representing all 198 differentially expressed transcripts. FIGS. 33B and 33C are close-up portions of the heatmap of FIG. 33A that respectively represent mouse- and dog-specific transcripts.

FIGS. 34A-34D are graphical representations of the detection of biomarkers of disease and host response. FIG. 34A is a heatmap of the 25 most differentially expressed dog transcripts identified by statistical analysis with DESeq2, which may be validated in dog patients with spontaneous osteosarcoma, and includes a graphical representation of a cycle in which exosomes from dogs with osteosarcoma and healthy dogs are used for in-species validation of transcripts identified from serum exosomes of xenograft models. FIG. 34B is a heatmap of 38 differentially expressed mouse genes. FIG. 34C illustrates significant pathways, and FIG. 34D illustrates a top network (cell cycle) identified by IPA as being associated with the differentially expressed host (mouse) genes shown in FIG. 34B.

In the example described herein, mouse xenografts were used to study theheterogeneity and biological behavior of OS in vivo. Specifically, thisapproach creates opportunities to examine tumor-intrinsic properties, aswell as organotypic, tumor-stromal interactions that influence tumorprogression. Cells were injected at the orthotopic site to simulate thebiology of the spontaneous disease. The anatomical site of implantationmay be considered carefully, as the biological behavior of tumors isdependent on both the intrinsic properties of tumor cells and hostfactors that differ between tissues and organs. The microenvironment insubcutaneous xenografts consists of desmoplastic mouse stromal cellsthat do not resemble the organization seen in autochthonous tumors.These properties also apply to OS: orthotopic canine OS xenografts innude mice produced osteoid matrix and metastasized spontaneously, whilesubcutaneous xenografts did not.

The data show that heterogeneity of biological behavior (including metastatic propensity) can be recapitulated to a limited extent in tumors from cell lines, but more readily by utilizing multiple cell lines that cover the spectrum of tumor behavior. Further, the data show that the major genetic drivers that distinguish the two canine OS cell lines in vitro were retained in the orthotopic xenografts. In addition to stability of the transcriptome, the cell lines show stable morphology from the primary canine tumors to the primary orthotopic tumors, and to the metastatic tumors. Confirmation of genetic and morphologic stability over many passages validated the utility of the present model to understand OS tumor heterogeneity.

As predicted from the original behavior of the spontaneous tumors in thedogs and from their gene and microRNA expression signatures, thelogarithmic expansion phase of OS-2 primary xenografts was faster thanthat of OS-1 primary xenografts. However, both cell lines seemed toreach the tumor endpoints at approximately the same time. Two factorsmight account for this. First, the tumors are growing within a cavitysurrounded by bone, and despite the fact that OS-2 xenografts showedgreater epiphyseal destruction and invasion, the bone constrains themaximum size achievable by the primary tumors within the experimentaltime frame. Second, mice with OS-2 xenografts did not show greatermorbidity than mice with OS-1 xenografts, determined by the absence oflameness, ambulatory deficits, and other behaviors associated withchronic pain. This could be due to adaptive behavior of prey species tohide pain; however, previous work has shown that painful intramedullarybone tumors produce behavioral changes in mice. It should be noted thatthese cell lines accurately represent the biological behavior of thetumors from which they were originally derived, and more broadly theclassification of more aggressive and less aggressive tumors.Furthermore, such properties have been verified independently by othergroups using one of these cell lines, and they generally extend to humanand murine osteosarcoma.

Beyond growth at the primary site, biological behavior can be quantified by metastatic propensity and successful spread to distant sites. Again, the predictions from the original spontaneous tumors were confirmed experimentally in the models described herein. OS-2 cells were a representative example from a group of highly aggressive tumors (worse prognosis) that showed high expression of cell cycle and DNA damage repair associated genes, with concomitant reduced expression of a complement of genes that defined “microenvironment interactions.” This reduced expression of molecules that mediate local cell communication could explain, at least in part, the observation that cells injected intratibially achieved rapid systemic distribution, spreading to the lungs within 6 hours; i.e., there was nothing to hold the cells in place, and they had no preference to remain in the local bone environment.

The results suggest that even though both OS-1 and OS-2 cell lines canestablish a metastatic niche, they do so with different kinetics,creating a suitable model to study intrinsic differences in metastaticpropensity, as well as host-related factors that contribute to themetastatic niche in OS.

Based on these observations, two distinct mechanisms for the differentmetastatic potential of OS-1 and OS-2 xenografts are herein proposed.OS-2 cells might have greater metastatic potential due to theirinteraction with the local microenvironment in the bone, which leads toreduced retention, and potentially to an increased capability tocondition the distant site. The alternative possibility is that OS-2cells seed the lungs shortly after inoculation, and even though many ofthese cells might leave the lungs or die, accounting for the loss ofluciferase signal by 24 hours, some cells remain and eventually form thepulmonary lesions (i.e., equivalent to seeding or colonization byintravenous inoculation). Preliminary experiments suggest that OS-1 andOS-2 cells have low efficiency of pulmonary colonization uponintravenous injection which would indicate the first possibility occurs.

Highly expressed mouse genes present in the OS-2 xenografts wereassociated with B cell signaling, inflammation, and immune response,whereas mouse genes in the OS-1 cells xenografts were associated withpatterning, and especially with muscle formation. Increased expressionof myogenic regulators in mouse stromal cells in OS-1 xenografts raisesinteresting questions regarding possible effects of OS-1 tumor cells onmarrow derived mesenchymal stromal cells.

Intriguingly, the most downregulated murine gene in the OS-2 xenograftswas the transcription factor Nkx2-1, which is known to regulate lungepithelial cell morphogenesis and differentiation. Down-regulation ofNKX2-1 has been shown to precede dissemination of lung adenocarcinomacells. NKX2-1 amplification has been reported in one human OS patientbut there are no reports of down regulation or loss of NKX2-1 in OSpatients.

Thus, xenograft models that recapitulated the heterogeneous biologicalbehavior of OS have been developed and have been described herein. Thesemodels may be useful to understand the mechanisms that drive progressionand metastasis of OS, as they are expandable into additional cell linesto represent a wider spectrum of disease.

Materials and Methods

Cells and culture conditions: Two canine OS cell lines representing previously described “less aggressive” and “highly aggressive” molecular phenotypes (OS-1 and OS-2, respectively) were used in this study. OS-1 and OS-2 are derivatives of the OSCA-32 and OSCA-40 cell lines, respectively. Specifically, OS-1 represents a subline that successfully established tumors after orthotopic implantation, as the parental OSCA-32 did not establish heterotopic or orthotopic tumors on every occasion. OS-2 represents the parental OSCA-40, which reliably formed tumors after orthotopic implantation in every experiment done.

Cell lines were validated using short tandem repeat (STR) profiles by DNA Diagnostics Center (DDC Medical) (Fairfield, Ohio). OS-1 and OS-2 cells were modified to stably express green fluorescent protein (GFP) and firefly luciferase as described (Scott et al., 2015) and used for orthotopic injections in mice. After transfection and selection, it was confirmed that the GFP/luciferase construct was stably integrated in each cell line by fluorescence in situ hybridization, and it was corroborated that the two cell lines had approximately equivalent luciferase activity on a per cell basis using conventional luciferase assays. All cell lines were grown in DMEM (Gibco, Grand Island, N.Y.) containing 5% glucose and L-glutamine, supplemented with 10% fetal bovine serum (Atlas Biologicals, Fort Collins, Colo.), 10 mM 4-(2-hydroxyethyl)-1-piperazine ethanesulphonic acid buffer (HEPES) and 0.1% Primocin (Invivogen, San Diego, Calif.) and cultured at 37° C. in a humidified atmosphere of 5% CO₂. Canine OS cell lines are available for distribution through Kerafast, Inc. (Boston, Mass.). Each cell line was passaged more than 15 times before the experiments when they were inoculated into mice.

Mice: Six-week old, female, athymic nude mice (strain NCr^(nu/nu)) wereobtained from Charles River Laboratories (Wilmington, Mass.). TheUniversity of Minnesota Institutional Animal Care and Use Committeeapproved protocols for mouse experiments of this study (Protocol No.:1307-30806A).

Tumor xenografts: Eight animals per group provide >95% power to identify a 15% change in the median time to tumor when the σ for both populations is <2.0 and the acceptable α error is 5% (p<0.05). Experimental replicates increased statistical robustness, accounting for the expected heterogeneity.

Four replicate experiments were done to assess orthotopic growth andmetastatic dissemination of OS-1 and OS-2 cells. For the first pilotexperiment, groups of three mice were used to validate the approach. Allof the mice receiving OS-1 xenografts showed successful implantation,but only two of the three mice receiving OS-2 xenografts showedsuccessful implantation. For the second experiment, groups of 16 micewere used to establish significance. In this experiment, all of the micereceiving OS-2 xenografts showed successful implantation, but eight miceinjected with OS-1 xenografts had significant adverse effects duringanesthesia and were not recovered (i.e., they were humanelyeuthanatized). For the third experiment, nine mice were inoculated withOS-2 cells to verify the unexpected effects of rapid dissemination tothe lung. No mice received OS-1 for this experiment. Finally, for thefourth experiment, five mice were inoculated with each cell line (OS-1and OS-2) to achieve a biological replicate of experiment two,maintaining the sample size at a number to maximize a positive outcome.Appropriate censoring was used to include all animals in the analyses,only excluding any which succumbed acutely or subacutely during theintratibial injection procedure. Thus, 16 mice inoculated with OS-1 wereincluded in the analyses of tumor growth, and 32 mice inoculated withOS-2 were included in the analyses of tumor growth.

It was previously determined that four samples per group approximate thepoint of minimal returns using large genomic datasets for geneexpression profiling, and these estimates hold true from microarrays toRNAseq where the fidelity of replication within samples is high, despiteorders of magnitude more data (see analysis of RNA sequencing below).

Animals were assigned to separate cages (4 animals each) in random order for each experiment. All of the animals in each cage received the same treatment. OS-1 and OS-2 cells expressing GFP and firefly luciferase were injected intratibially. Mice were anesthetized with xylazine (10 mg/kg, I.P.) and ketamine (100 mg/kg, I.P.), and 1×10⁵ cells suspended in 10 μl of sterile PBS were injected into the left tibia using a tuberculin syringe with a 29-gauge needle. Buprenorphine (0.075 mg/kg, I.P. q.8 hours; Buprenex®, Reckitt Benckiser Healthcare, Richmond, Va.) was used for pain control over the first 24 hours after injection of tumor cells, and prophylactic ibuprofen administered in the water was used over the next 3.

Tumor growth was monitored weekly by measuring the width and length of the proximal tibia and the stifle joint using calipers, as well as by in vivo imaging as described (Kim et al., 2014). Bioluminescence imaging (Xenogen IVIS Spectrum, Caliper Life Sciences, Hopkinton, Mass.) was done after injection of D-luciferin (Gold Biotechnology, St. Louis, Mo.) following isoflurane inhalant anesthesia and analyzed with Living Image Software (Caliper Life Sciences). Bone tissue volume was calculated from both tibiae using the equation V=L×W²×0.52 (Banerjee et al., 2013), and tumor volume was estimated by subtracting the normal bone tissue volume of the contralateral unaffected (right) tibia from the volume of the affected (left) tibia.
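The volume estimate described above can be illustrated with a minimal Python sketch. The function names, the tuple layout, and the use of millimeters are assumptions made for illustration only; the equation V=L×W²×0.52 and the subtraction of the contralateral tibia are taken from the text.

```python
def ellipsoid_volume(length, width):
    """Bone tissue volume from caliper measurements: V = L x W^2 x 0.52."""
    return length * width ** 2 * 0.52


def tumor_volume(affected, contralateral):
    """Estimate tumor volume as affected-tibia volume minus contralateral-tibia volume.

    Each argument is a hypothetical (length, width) tuple in the same unit
    (millimeters are assumed here, giving cubic millimeters).
    """
    return ellipsoid_volume(*affected) - ellipsoid_volume(*contralateral)


# Hypothetical measurements: left (affected) tibia 10 x 5, right tibia 10 x 4.
example = tumor_volume((10.0, 5.0), (10.0, 4.0))
```

With these hypothetical values, the affected tibia gives 130.0 and the contralateral tibia gives 83.2, so the estimated tumor volume is approximately 46.8.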

Mice were observed for up to 8 weeks or until tumor endpoint criteria were reached (ill thrift, tumor reaching 1 cm in the largest diameter, visible lameness, pain, or severe weight loss), at which time they were humanely euthanized with pentobarbital sodium and sodium phenytoin solution (Beuthanasia-D Special®, Schering-Plough Animal Health, Union, N.J.). Primary bone tumors and lung tissues were dissected and a portion of each was stored at −80° C. for RNA extraction. The remaining tissues were fixed in 10% neutral-buffered formalin and processed for routine histological examination.

Luciferase activity and tumor sizes were compared using multiple t-tests with the Holm-Sidak method in Prism 6 software (GraphPad). p<0.05 was used as the level of significance.
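As an illustration of the multiple-comparison step, the following Python sketch computes step-down Holm-Sidak adjusted p values from a list of raw p values. It is a generic textbook formulation, not a reproduction of the Prism 6 implementation, and the function name and example p values are hypothetical.

```python
def holm_sidak(pvalues):
    """Return Holm-Sidak adjusted p values in the original order.

    The k-th smallest p value (k = 0, 1, ...) among m tests is adjusted to
    1 - (1 - p)^(m - k); a running maximum keeps the adjusted values
    non-decreasing along the sorted order.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        adj = 1.0 - (1.0 - pvalues[idx]) ** (m - rank)
        running_max = max(running_max, adj)
        adjusted[idx] = min(running_max, 1.0)
    return adjusted


# Hypothetical raw p values from three time-point comparisons.
adjusted = holm_sidak([0.01, 0.04, 0.03])
significant = [p < 0.05 for p in adjusted]
```

For these hypothetical inputs only the first comparison remains below the 0.05 threshold after adjustment.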

RNA extraction, library preparation, and RNA sequencing: Total RNA was extracted from primary intratibial tumors and from cell lines using the miRNeasy Mini Kit (QIAGEN, Valencia, Calif.). RNA integrity was examined using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, Calif.), and RIN values of all samples were >8.0. Sequencing libraries were prepared with the TruSeq Library Preparation Kit (Illumina, San Diego, Calif.). RNA sequencing (100-bp paired-end) with a HiSeq 2500 (Illumina) was done at the University of Minnesota Genomics Center (UMGC). A minimum of ten million read-pairs was generated for each sample.

Analysis of RNA sequencing data: Initial quality control analysis of RNA sequencing (FASTQ) data for each sample was performed using the FastQC software (version 0.11.2) (Andrews). FASTQ data were trimmed with Trimmomatic (Bolger, 2014). HISAT2 (Kim et al., 2015) was used to map paired-end reads from eight xenograft tumors (four tumors of OS-1 and four tumors of OS-2) and four parental cell line samples (two each for OS-1 and OS-2 cell lines). For accurate alignment of sequencing reads to canine and murine genes within xenograft tumors, a HISAT2 index for mapping was built from a multi-sequence fasta file containing both the canine (canFam3) and murine (mm10) genomes. Insertion size metrics were calculated for each sample using Picard software (version 1.126) (http://picard.sourceforge.net.). Samtools (version 1.0_BCFTools_HTSLib) was used to sort and index the bam files (Li et al., 2009). Transcript abundance estimates were generated using the Rsubread featureCounts program for differential gene expression analysis (Liao et al., 2014).
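The combined canine and murine reference described above can be sketched as follows. This Python example merges FASTA records from two genomes into one multi-sequence file and prefixes each sequence name with a species label so that reads can later be attributed to donor or host. The prefixing scheme, the function name, and the toy sequences are assumptions for illustration; the text states only that a single multi-sequence fasta file containing both genomes was used.

```python
import io


def merge_genomes(sources, out_handle):
    """Write all FASTA records from several genomes to one output handle.

    sources: list of (prefix, readable handle) pairs. Each header ">name"
    becomes ">prefix_name" (a hypothetical convention) so that identical
    chromosome names from different species stay distinguishable.
    Returns the number of sequences written.
    """
    n_sequences = 0
    for prefix, handle in sources:
        for line in handle:
            text = line.rstrip("\r\n")
            if not text:
                continue  # skip blank lines
            if text.startswith(">"):
                out_handle.write(f">{prefix}_{text[1:]}\n")
                n_sequences += 1
            else:
                out_handle.write(text + "\n")
    return n_sequences


# Toy stand-ins for the canFam3 and mm10 FASTA files.
dog = io.StringIO(">chr1\nACGT\n")
mouse = io.StringIO(">chr1\nTTGA")
merged = io.StringIO()
count = merge_genomes([("canFam3", dog), ("mm10", mouse)], merged)
merged_text = merged.getvalue()
```

In practice the handles would be opened genome files, and the merged file would serve as the input for building the alignment index.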

Gene counts for each xenograft sample were imported into RStudio (v. 3.2.3) for differential gene expression (DGE) analysis with EdgeR. Lowly expressed genes were removed by filtering. A gene was considered expressed if it had log2-transformed read counts per million (CPM) >1 in at least two of the eight xenograft tumors. Biological variation within xenograft sample groups was estimated by common dispersion and biological coefficient of variation (BCV) calculations. Pair-wise empirical analysis of differential gene expression was performed on sample groups (OS-1 and OS-2) using ‘Exact Test’ for two-group comparisons with trimmed mean of M-values (TMM) normalization (Robinson and Oshlack, 2010). Tagwise dispersion (individual dispersion for each gene) was used to adjust for abundance differences across biological replicates (n=4) within each xenograft group (OS-1 and OS-2). Gene counts, as CPM, were imported into Partek Genomic Suite for clustering analysis and visualization. The Pearson similarity metric and average linkage clustering method were used for hierarchical clustering of mean-centered CPM values. Enriched pathway and functional classification analyses of DGEs were performed using IPA. The reference set for all IPA analyses was the Ingenuity Knowledge Base (genes only), and human Entrez gene names were used as the output format. To understand the high-level functions and utilities associated with each gene identified as differentially expressed between OS-1 and OS-2, MetaCore software (Thomson Reuters) was used to identify statistically over-represented cellular processes in the dataset.
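The expression filter described above can be sketched in Python as follows. The sketch keeps a gene when its log2-transformed CPM exceeds 1 in at least a minimum number of samples. It assumes that library size is the column sum of the count table and that a plain log2 is applied without a prior count; EdgeR's own CPM routine may differ on both points. The function name and the toy counts are hypothetical.

```python
import math


def expressed_genes(count_table, min_log2_cpm=1.0, min_samples=2):
    """Return genes whose log2(CPM) exceeds the threshold in enough samples.

    count_table: dict mapping gene name -> list of raw read counts,
    one entry per sample, in the same sample order for every gene.
    """
    rows = list(count_table.values())
    n_samples = len(rows[0]) if rows else 0
    # Assumed library size: total counts per sample across all genes.
    lib_sizes = [sum(row[j] for row in rows) for j in range(n_samples)]
    kept = []
    for gene, row in count_table.items():
        passing = 0
        for count, lib in zip(row, lib_sizes):
            cpm = count * 1e6 / lib if lib else 0.0
            if cpm > 0 and math.log2(cpm) > min_log2_cpm:
                passing += 1
        if passing >= min_samples:
            kept.append(gene)
    return kept


# Toy table with three samples; each column sums to one million reads.
toy_counts = {
    "A": [500000, 500000, 0],
    "B": [1, 0, 1],
    "C": [499999, 500000, 999999],
}
kept = expressed_genes(toy_counts)
```

In the toy table, gene B has a CPM of at most 1 (log2 of 0) in every sample and is removed, while A and C pass in at least two samples. For the eight xenograft tumors, the same call with the default `min_samples=2` would mirror the criterion stated in the text.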

The following table, referred to above as Table S2, provides a list of differentially-expressed murine genes in OS-2 relative to OS-1 xenografts. Fold change (FC), p values, and FDR-adjusted p values were calculated by pair-wise Exact Test comparisons in EdgeR. Genes were annotated with the Ingenuity Knowledge Base of IPA.

Gene | logFC | PValue | FDR | Entrez Gene Name | Location | Type(s)
Nkx2-1 | −10.97163756 | 3.67E−43 | 2.51E−39 | NK2 homeobox 1 | Nucleus | transcription regulator
Zic1 | −10.37720965 | 2.80E−28 | 5.47E−25 | Zic family member 1 | Nucleus | transcription regulator
Kcnq2 | −7.93642633 | 1.90E−18 | 1.37E−15 | potassium channel, voltage gated KQT-like subfamily Q, member 2 | Plasma Membrane | ion channel
Smpd3 | −5.937398394 | 4.58E−11 | 4.93E−09 | sphingomyelin phosphodiesterase 3, neutral membrane (neutral sphingomyelinase II) | Cytoplasm | enzyme
Phex | −5.479136674 | 3.98E−13 | 7.78E−11 | phosphate regulating endopeptidase homolog, X-linked | Cytoplasm | peptidase
Fam43b | −5.16551588 | 7.72E−13 | 1.37E−10 | family with sequence similarity 43, member B | Other | other
Myoz2 | −5.031353137 | 1.21E−14 | 3.75E−12 | myozenin 2 | Other | other
Myh2 | −5.016278218 | 1.42E−10 | 1.37E−08 | myosin, heavy chain 2, skeletal muscle, adult | Cytoplasm | enzyme
Hoxd13 | −5.005651203 | 6.42E−16 | 2.83E−13 | homeobox D13 | Nucleus | transcription regulator
Slc13a5 | −4.983558827 | 3.64E−11 | 4.15E−09 | solute carrier family 13 (sodium-dependent citrate transporter), member 5 | Plasma Membrane | transporter
Panx3 | −4.824377849 | 2.74E−06 | 7.53E−05 | pannexin 3 | Plasma Membrane | other
Tmem145 | −4.602792152 | 2.02E−10 | 1.92E−08 | transmembrane protein 145 | Other | other
Lect1 | −4.567514167 | 5.15E−09 | 3.25E−07 | leukocyte cell derived chemotaxin 1 | Extracellular Space | other
Asic3 | −4.473594151 | 2.73E−14 | 7.19E−12 | acid sensing (proton gated) ion channel 3 | Plasma Membrane | ion channel
Ankrd2 | −4.445048104 | 2.61E−19 | 2.23E−16 | ankyrin repeat domain 2 (stretch responsive muscle) | Nucleus | transcription regulator
Fam180a | −4.444919786 | 1.04E−07 | 4.35E−06 | family with sequence similarity 180, member A | Other | other
Col9a1 | −4.397295279 | 2.52E−11 | 3.08E−09 | collagen, type IX, alpha 1 | Extracellular Space | other
Fabp3 | −4.367492217 | 2.88E−18 | 1.97E−15 | fatty acid binding protein 3, muscle and heart | Cytoplasm | transporter
Lyz1 | −4.308114309 | 0.000102599 | 0.001637874 | lysozyme | Extracellular Space | enzyme
Adamts18 | −4.303169665 | 3.57E−09 | 2.42E−07 | ADAM metallopeptidase with thrombospondin type 1 motif, 18 | Extracellular Space | peptidase
Sp7 | −4.224816491 | 1.29E−08 | 7.16E−07 | Sp7 transcription factor | Nucleus | transcription regulator
Omd | −4.186098129 | 2.89E−17 | 1.58E−14 | osteomodulin | Extracellular Space | other
1700101I11Rik | −4.155168387 | 2.75E−07 | 9.92E−06 | RIKEN cDNA 1700101I11 gene | Other | other
Dlx6 | −4.109636472 | 3.22E−11 | 3.76E−09 | distal-less homeobox 6 | Nucleus | transcription regulator
Actn2 | −4.069827028 | 5.15E−17 | 2.71E−14 | actinin, alpha 2 | Nucleus | transcription regulator
Tceal7 | −4.038267592 | 4.23E−06 | 0.000108501 | transcription elongation factor A (SII)-like 7 | Nucleus | transcription regulator
Xirp2 | −4.014810326 | 6.23E−10 | 5.33E−08 | xin actin binding repeat containing 2 | Other | other
Paqr6 | −3.961900754 | 2.42E−10 | 2.28E−08 | progestin and adipoQ receptor family member VI | Plasma Membrane | other
Csrp3 | −3.936461238 | 8.12E−11 | 8.23E−09 | cysteine and glycine-rich protein 3 (cardiac LIM protein) | Nucleus | other
Fcrls | −3.810649791 | 2.99E−37 | 1.36E−33 | Fc receptor-like S, scavenger receptor | Plasma Membrane | other
Ckmt2 | −3.806425077 | 2.10E−09 | 1.55E−07 | creatine kinase, mitochondrial 2 (sarcomeric) | Cytoplasm | kinase
Myl6b | −3.787544034 | 9.99E−09 | 5.82E−07 | myosin, light chain 6B, alkali, smooth muscle and non-muscle | Cytoplasm | other
Gli1 | −3.740040126 | 9.38E−17 | 4.58E−14 | GLI family zinc finger 1 | Nucleus | transcription regulator
Zdbf2 | −3.73484272 | 1.50E−07 | 6.03E−06 | zinc finger, DBF-type containing 2 | Other | other
Opn1mw | −3.726345151 | 8.04E−07 | 2.55E−05 | opsin 1 (cone pigments), long-wave-sensitive | Plasma Membrane | G-protein coupled receptor
Gsg1l | −3.72152491 | 9.85E−10 | 8.07E−08 | GSG1-like | Plasma Membrane | other
Abra | −3.69798677 | 9.51E−13 | 1.65E−10 | actin binding Rho activating protein | Cytoplasm | other
Myom3 | −3.681916158 | 4.38E−08 | 2.07E−06 | myomesin 3 | Other | other
Serpinb1c | −3.657917675 | 4.90E−11 | 5.20E−09 | serine (or cysteine) peptidase inhibitor, clade B, member 1c | Extracellular Space | other
Foxg1 | −3.648321574 | 1.21E−19 | 1.14E−16 | forkhead box G1 | Nucleus | transcription regulator
Ifitm5 | −3.617680152 | 2.81E−09 | 2.03E−07 | interferon induced transmembrane protein 5 | Plasma Membrane | other
9130024F11Rik | −3.490015735 | 2.34E−12 | 3.72E−10 | RIKEN cDNA 9130024F11 gene | Other | other
Csrnp3 | −3.450251912 | 1.48E−06 | 4.37E−05 | cysteine-serine-rich nuclear protein 3 | Nucleus | transcription regulator
Slc47a1 | −3.443994582 | 2.09E−06 | 5.96E−05 | solute carrier family 47 (multidrug and toxin extrusion), member 1 | Plasma Membrane | transporter
2410137F16Rik | −3.441108061 | 6.33E−07 | 2.07E−05 | #N/A | #N/A | #N/A
A930003A15Rik | −3.43849311 | 7.11E−08 | 3.21E−06 | RIKEN cDNA A930003A15 gene | Other | other
Col11a2 | −3.43289575 | 9.67E−09 | 5.68E−07 | collagen, type XI, alpha 2 | Extracellular Space | other
Cst6 | −3.420908168 | 2.33E−07 | 8.66E−06 | cystatin E/M | Extracellular Space | other
Actc1 | −3.400606916 | 1.73E−07 | 6.78E−06 | actin, alpha, cardiac muscle 1 | Cytoplasm | enzyme
Atp1b4 | −3.352991208 | 2.08E−05 | 0.000425726 | ATPase, Na+/K+ transporting, beta 4 polypeptide | Plasma Membrane | transporter
Alpk2 | −3.341104185 | 9.43E−05 | 0.001527325 | alpha-kinase 2 | Nucleus | kinase
Myh1 | −3.324236394 | 1.55E−09 | 1.18E−07 | myosin, heavy chain 1, skeletal muscle, adult | Plasma Membrane | enzyme
Hspb3 | −3.317439048 | 1.86E−08 | 9.63E−07 | heat shock 27 kDa protein 3 | Cytoplasm | other
Itgb1bp2 | −3.311620649 | 2.66E−09 | 1.93E−07 | integrin beta 1 binding protein (melusin) 2 | Other | other
Casq2 | −3.292086149 | 1.53E−08 | 8.17E−07 | calsequestrin 2 (cardiac muscle) | Cytoplasm | other
Dmp1 | −3.257994756 | 1.22E−09 | 9.51E−08 | dentin matrix acidic phosphoprotein 1 | Extracellular Space | other
Plb1 | −3.242320263 | 4.00E−07 | 1.39E−05 | phospholipase B1 | Cytoplasm | enzyme
Cacna2d2 | −3.241001086 | 1.16E−09 | 9.20E−08 | calcium channel, voltage-dependent, alpha 2/delta subunit 2 | Plasma Membrane | ion channel
AU022793 | −3.198508581 | 2.72E−06 | 7.49E−05 | expressed sequence AU022793 | Other | other
Mylpf | −3.196418301 | 5.32E−07 | 1.77E−05 | myosin light chain, phosphorylatable, fast skeletal muscle | Cytoplasm | other
Xirp1 | −3.193088545 | 6.65E−12 | 9.38E−10 | xin actin binding repeat containing 1 | Plasma Membrane | other
Dok7 | −3.156216975 | 4.84E−09 | 3.08E−07 | docking protein 7 | Extracellular Space | other
Hsd11b2 | −3.155020704 | 5.80E−10 | 5.06E−08 | hydroxysteroid (11-beta) dehydrogenase 2 | Cytoplasm | enzyme
Fat3 | −3.116929351 | 9.82E−07 | 3.04E−05 | FAT atypical cadherin 3 | Plasma Membrane | other
Bex1 | −3.098852594 | 5.93E−07 | 1.96E−05 | brain expressed gene 1 | Other | other
Siglec1 | −3.056322266 | 5.56E−36 | 1.52E−32 | sialic acid binding Ig-like lectin 1, sialoadhesin | Plasma Membrane | other
Klhl30 | −3.042670774 | 3.66E−12 | 5.38E−10 | kelch-like family member 30 | Other | other
Srpk3 | −3.036559842 | 8.83E−10 | 7.33E−08 | SRSF protein kinase 3 | Cytoplasm | kinase
Nmrk2 | −3.036504708 | 9.23E−08 | 3.94E−06 | nicotinamide riboside kinase 2 | Plasma Membrane | kinase
Hspb7 | −3.035223935 | 4.33E−12 | 6.23E−10 | heat shock 27 kDa protein family, member 7 (cardiovascular) | Cytoplasm | other
Hspb6 | −3.010038437 | 3.90E−11 | 4.38E−09 | heat shock protein, alpha-crystallin-related, B6 | Cytoplasm | other
Myh3 | −3.005412878 | 3.96E−09 | 2.63E−07 | myosin, heavy chain 3, skeletal muscle, embryonic | Cytoplasm | enzyme
Zim1 | −2.999002281 | 5.28E−07 | 1.77E−05 | zinc finger, imprinted 1 | Nucleus | other
Fhl1 | −2.971716832 | 7.50E−12 | 1.05E−09 | four and a half LIM domains 1 | Cytoplasm | other
Cryab | −2.942079665 | 4.57E−11 | 4.93E−09 | crystallin, alpha B | Nucleus | other
Col26a1 | −2.922599266 | 0.000140087 | 0.002118637 | collagen, type XXVI, alpha 1 | Extracellular Space | other
Tpm2 | −2.901026545 | 3.06E−11 | 3.67E−09 | tropomyosin 2, beta | Cytoplasm | other
Smpx | −2.8905869 | 3.29E−15 | 1.25E−12 | small muscle protein, X-linked | Cytoplasm | other
Mybpc1 | −2.889275402 | 7.51E−07 | 2.41E−05 | myosin binding protein C, slow type | Cytoplasm | other
Car3 | −2.883690634 | 3.67E−09 | 2.48E−07 | carbonic anhydrase III | Cytoplasm | enzyme
Myl3 | −2.879903468 | 1.51E−05 | 0.000327184 | myosin, light chain 3, alkali; ventricular, skeletal, slow | Cytoplasm | other
Acta1 | −2.858497666 | 2.23E−07 | 8.33E−06 | actin, alpha 1, skeletal muscle | Cytoplasm | other
Adprhl1 | −2.839594188 | 3.39E−09 | 2.36E−07 | ADP-ribosylhydrolase like 1 | Other | enzyme
Robo2 | −2.838080162 | 1.25E−06 | 3.77E−05 | roundabout guidance receptor 2 | Plasma Membrane | transmembrane receptor
Col9a2 | −2.836541279 | 9.14E−11 | 9.13E−09 | collagen, type IX, alpha 2 | Extracellular Space | other
Frzb | −2.822407749 | 2.37E−07 | 8.77E−06 | frizzled-related protein | Extracellular Space | other
Matn3 | −2.818890834 | 2.10E−05 | 0.000429131 | matrilin 3 | Extracellular Space | other
Vgll2 | −2.779983734 | 3.56E−09 | 2.42E−07 | vestigial-like family member 2 | Nucleus | transcription regulator
Alpl | −2.774474841 | 3.78E−13 | 7.49E−11 | alkaline phosphatase, liver/bone/kidney | Plasma Membrane | phosphatase
Cdo1 | −2.770636161 | 1.67E−08 | 8.76E−07 | cysteine dioxygenase type 1 | Cytoplasm | enzyme
Mfsd7c | −2.762027564 | 6.69E−07 | 2.18E−05 | feline leukemia virus subgroup C cellular receptor family, member 2 | Plasma Membrane | transporter
Crhr2 | −2.736336233 | 1.20E−11 | 1.54E−09 | corticotropin releasing hormone receptor 2 | Plasma Membrane | G-protein coupled receptor
Myadml2 | −2.724262402 | 4.46E−06 | 0.00011354 | myeloid-associated differentiation marker-like 2 | Cytoplasm | other
Pax2 | −2.719600255 | 9.94E−06 | 0.000228268 | paired box 2 | Nucleus | transcription regulator
Zic2 | −2.715698644 | 1.17E−05 | 0.000263232 | Zic family member 2 | Nucleus | transcription regulator
S100b | −2.709990606 | 1.76E−14 | 5.13E−12 | S100 calcium binding protein B | Cytoplasm | other
Synpo2l | −2.701441875 | 3.57E−09 | 2.42E−07 | synaptopodin 2-like | Cytoplasm | other
Cox6a2 | −2.693936484 | 4.57E−07 | 1.54E−05 | cytochrome c oxidase subunit VIa polypeptide 2 | Cytoplasm | enzyme
Gm6524 | −2.692410726 | 5.56E−05 | 0.000970923 | katanin p60 (ATPase-containing) subunit A1 pseudogene | Other | other
Ccrl1 | −2.676579274 | 2.47E−06 | 6.85E−05 | #N/A | #N/A | #N/A
Col22a1 | −2.674727987 | 3.90E−10 | 3.47E−08 | collagen, type XXII, alpha 1 | Extracellular Space | other
Cav3 | −2.673548764 | 1.73E−07 | 6.78E−06 | caveolin 3 | Plasma Membrane | enzyme
Slc38a3 | −2.654180586 | 6.41E−05 | 0.001092134 | solute carrier family 38, member 3 | Plasma Membrane | transporter
Tmem8c | −2.65326667 | 8.29E−06 | 0.000193267 | transmembrane protein 8C | Plasma Membrane | other
Klhl41 | −2.642347539 | 2.57E−07 | 9.39E−06 | kelch-like family member 41 | Cytoplasm | other
Des | −2.619769533 | 8.19E−12 | 1.11E−09 | desmin | Cytoplasm | other
Ldb3 | −2.619385875 | 7.34E−06 | 0.000173688 | LIM domain binding 3 | Cytoplasm | transporter
Sbk2 | −2.606775291 | 8.68E−05 | 0.001420753 | SH3 domain binding kinase family, member 2 | Other | other
Popdc2 | −2.58927308 | 2.98E−06 | 8.02E−05 | popeye domain containing 2 | Other | other
Snca | −2.588807158 | 4.55E−06 | 0.000115237 | synuclein, alpha (non A4 component of amyloid precursor) | Cytoplasm | other
Ogn | −2.586436671 | 3.77E−10 | 3.37E−08 | osteoglycin | Extracellular Space | growth factor
Lmod2 | −2.574611571 | 7.58E−07 | 2.43E−05 | leiomodin 2 (cardiac) | Other | other
Lepr | −2.568144399 | 1.03E−08 | 5.96E−07 | leptin receptor | Plasma Membrane | transmembrane receptor
Lrrc30 | −2.564534101 | 0.000103946 | 0.001655518 | leucine rich repeat containing 30 | Other | other
Tuba8 | −2.563554458 | 0.000319672 | 0.004213331 | tubulin, alpha 8 | Cytoplasm | other
Tceal5 | −2.557601315 | 5.74E−05 | 0.000995638 | transcription elongation factor A (SII)-like 5 | Other | other
Myot | −2.538095094 | 1.07E−05 | 0.000242622 | myotilin | Cytoplasm | other
Ndnf | −2.537156126 | 6.13E−10 | 5.28E−08 | neuron-derived neurotrophic factor | Extracellular Space | other
Ch25h | −2.531462006 | 3.17E−12 | 4.77E−10 | cholesterol 25-hydroxylase | Cytoplasm | enzyme
Lrtm1 | −2.531422723 | 0.000304181 | 0.004044222 | leucine-rich repeats and transmembrane domains 1 | Other | other
Yipf7 | −2.520519196 | 3.17E−06 | 8.47E−05 | Yip1 domain family, member 7 | Other | other
Rsad2 | −2.52017011 | 3.91E−08 | 1.88E−06 | radical S-adenosyl methionine domain containing 2 | Cytoplasm | enzyme
Myl1 | −2.508451421 | 8.20E−05 | 0.001356344 | myosin, light chain 1, alkali; skeletal, fast | Cytoplasm | other
Gm10767 | −2.507994364 | 1.62E−05 | 0.000344605 | predicted gene 10767 | Other | other
Col9a3 | −2.503274415 | 0.00020731 | 0.002966738 | collagen, type IX, alpha 3 | Extracellular Space | other
Pdlim3 | −2.502660067 | 9.22E−09 | 5.50E−07 | PDZ and LIM domain 3 | Plasma Membrane | other
Tnnc2 | −2.502251052 | 2.88E−05 | 0.000568941 | troponin C type 2 (fast) | Cytoplasm | other
Myom2 | −2.481140713 | 6.30E−07 | 2.07E−05 | myomesin 2 | Cytoplasm | other
Ccl4 | −2.473254523 | 1.56E−09 | 1.18E−07 | chemokine (C-C motif) ligand 4 | Extracellular Space | cytokine
Fgfr4 | −2.467767258 | 5.43E−05 | 0.000952702 | fibroblast growth factor receptor 4 | Plasma Membrane | kinase
Hand2 | −2.466208146 | 5.90E−05 | 0.001018934 | heart and neural crest derivatives expressed 2 | Nucleus | transcription regulator
Ppargc1a | −2.461171212 | 2.96E−05 | 0.000580045 | peroxisome proliferator-activated receptor gamma, coactivator 1 alpha | Nucleus | transcription regulator
Asb12 | −2.455387214 | 9.25E−05 | 0.001503012 | ankyrin repeat and SOCS box containing 12 | Nucleus | transcription regulator
Klhl40 | −2.453997338 | 1.87E−08 | 9.68E−07 | kelch-like family member 40 | Other | other
Hspa1l | −2.448496231 | 7.22E−10 | 6.10E−08 | heat shock 70 kDa protein 1-like | Cytoplasm | other
Srl | −2.437036897 | 2.82E−06 | 7.69E−05 | sarcalumenin | Cytoplasm | other
Fndc5 | −2.434034744 | 6.89E−06 | 0.000164402 | fibronectin type III domain containing 5 | Other | other
Tnnt3 | −2.43394736 | 4.26E−05 | 0.000786196 | troponin T type 3 (skeletal, fast) | Cytoplasm | other
Greb1 | −2.426503014 | 1.19E−08 | 6.75E−07 | growth regulation by estrogen in breast cancer 1 | Cytoplasm | other
I830012O16Rik | −2.419437095 | 0.000180549 | 0.002630558 | #N/A | #N/A | #N/A
Sox11 | −2.416593585 | 3.33E−09 | 2.34E−07 | SRY (sex determining region Y)-box 11 | Nucleus | transcription regulator
Nrcam | −2.409994027 | 1.07E−05 | 0.000242622 | neuronal cell adhesion molecule | Plasma Membrane | other
Foxl1 | −2.40842338 | 5.02E−06 | 0.000124633 | forkhead box L1 | Nucleus | transcription regulator
Foxc1 | −2.3807345 | 1.25E−19 | 1.14E−16 | forkhead box C1 | Nucleus | transcription regulator
Tuba4a | −2.367985915 | 4.07E−07 | 1.41E−05 | tubulin, alpha 4a | Cytoplasm | other
Tcap | −2.362652632 | 7.01E−07 | 2.27E−05 | titin-cap | Cytoplasm | other
B430306N03Rik | −2.351177563 | 4.46E−07 | 1.51E−05 | RIKEN cDNA B430306N03 gene | Other | other
Cap2 | −2.351093169 | 7.43E−08 | 3.31E−06 | CAP, adenylate cyclase-associated protein, 2 (yeast) | Plasma Membrane | other
Ucp3 | −2.345743121 | 3.11E−05 | 0.00060655 | uncoupling protein 3 (mitochondrial, proton carrier) | Cytoplasm | transporter
Dmrta2 | −2.338686972 | 1.16E−06 | 3.53E−05 | DMRT-like family A2 | Nucleus | transcription regulator
Fgfr3 | −2.337098522 | 2.07E−11 | 2.60E−09 | fibroblast growth factor receptor 3 | Plasma Membrane | kinase
Mapt | −2.331242707 | 2.08E−08 | 1.06E−06 | microtubule-associated protein tau | Plasma Membrane | other
Fgfr2 | −2.32142915 | 2.02E−05 | 0.000418907 | fibroblast growth factor receptor 2 | Plasma Membrane | kinase
Hhatl | −2.311519505 | 2.06E−05 | 0.000424294 | hedgehog acyltransferase-like | Cytoplasm | enzyme
Jsrp1 | −2.307700559 | 7.35E−08 | 3.30E−06 | junctional sarcoplasmic reticulum protein 1 | Cytoplasm | other
Ppm1e | −2.302246505 | 7.68E−07 | 2.45E−05 | protein phosphatase, Mg2+/Mn2+ dependent, 1E | Nucleus | phosphatase
Flnc | −2.296173473 | 1.55E−07 | 6.21E−06 | filamin C, gamma | Cytoplasm | other
Smad9 | −2.295442355 | 7.21E−06 | 0.000171229 | SMAD family member 9 | Nucleus | transcription regulator
Alpk3 | −2.286390928 | 7.03E−07 | 2.27E−05 | alpha-kinase 3 | Nucleus | kinase
Npr3 | −2.28173834 | 5.37E−07 | 1.79E−05 | natriuretic peptide receptor 3 | Plasma Membrane | G-protein coupled receptor
Fras1 | −2.279123226 | 1.29E−07 | 5.32E−06 | Fraser extracellular matrix complex subunit 1 | Extracellular Space | other
Cmpk2 | −2.278794093 | 6.99E−06 | 0.000166661 | cytidine monophosphate (UMP-CMP) kinase 2, mitochondrial | Cytoplasm | kinase
Rbp7 | −2.276049923 | 1.43E−10 | 1.38E−08 | retinol binding protein 7, cellular | Cytoplasm | other
Popdc3 | −2.270239846 | 1.86E−07 | 7.13E−06 | popeye domain containing 3 | Other | other
Dusp26 | −2.262773674 | 1.97E−05 | 0.000409563 | dual specificity phosphatase 26 (putative) | Cytoplasm | enzyme
Slc28a2 | −2.26166738 | 3.16E−06 | 8.47E−05 | solute carrier family 28 (concentrative nucleoside transporter), member 2 | Plasma Membrane | transporter
Smyd1 | −2.257820234 | 1.29E−05 | 0.000287231 | SET and MYND domain containing 1 | Nucleus | transcription regulator
Tbx1 | −2.253565715 | 3.52E−09 | 2.42E−07 | T-box 1 | Nucleus | transcription regulator
Tnni2 | −2.250955521 | 0.000114204 | 0.001787671 | troponin I type 2 (skeletal, fast) | Cytoplasm | enzyme
Ccl3 | −2.249366721 | 9.28E−05 | 0.001506348 | chemokine (C-C motif) ligand 3-like 3 | Extracellular Space | cytokine
Slc16a4 | −2.247378871 | 7.91E−05 | 0.0013143 | solute carrier family 16, member 4 | Plasma Membrane | transporter
3425401B19Rik | −2.245394181 | 7.12E−06 | 0.000169305 | chromosome 10 open reading frame 71 | Other | other
Lrtm2 | −2.237129762 | 0.000265996 | 0.003635462 | leucine-rich repeats and transmembrane domains 2 | Other | other
Sult1a1 | −2.236886147 | 3.44E−06 | 9.05E−05 | sulfotransferase family 1A, phenol-preferring, member 1 | Cytoplasm | enzyme
Nrap | −2.225366834 | 3.52E−06 | 9.21E−05 | nebulin-related anchoring protein | Cytoplasm | other
Cacna1s | −2.216075785 | 1.08E−05 | 0.000245159 | calcium channel, voltage-dependent, L type, alpha 1S subunit | Plasma Membrane | ion channel
Mum1l1 | −2.213497637 | 0.000179172 | 0.002616065 | melanoma associated antigen (mutated) 1-like 1 | Cytoplasm | other
Hk3 | −2.213406939 | 2.77E−12 | 4.31E−10 | hexokinase 3 (white cell) | Cytoplasm | kinase
Camk2b | −2.209496356 | 6.55E−09 | 3.98E−07 | calcium/calmodulin-dependent protein kinase II beta | Cytoplasm | kinase
Lamc3 | −2.208122416 | 5.95E−05 | 0.001026687 | laminin, gamma 3 | Extracellular Space | other
Wnt10b | −2.205958028 | 2.75E−06 | 7.53E−05 | wingless-type MMTV integration site family, member 10B | Extracellular Space | other
Fam107a | −2.204108844 | 0.000258693 | 0.003553395 | family with sequence similarity 107, member A | Nucleus | other
2310002L09Rik | −2.194169008 | 5.30E−05 | 0.000933799 | RIKEN cDNA 2310002L09 gene | Cytoplasm | other
Meis1 | −2.193357038 | 1.42E−08 | 7.72E−07 | Meis homeobox 1 | Nucleus | transcription regulator
Trdn | −2.192261248 | 4.81E−06 | 0.000120606 | triadin | Cytoplasm | other
Mlip | −2.187237572 | 3.41E−06 | 8.99E−05 | muscular LMNA-interacting protein | Nucleus | other
Sh3bgr | −2.183033333 | 0.000248973 | 0.003433665 | SH3-binding domain glutamic acid-rich protein | Cytoplasm | other
Prkag3 | −2.181939396 | 0.0001064 | 0.001682844 | protein kinase, AMP-activated, gamma 3 non-catalytic subunit | Cytoplasm | other
Cacng1 | −2.168663866 | 1.05E−06 | 3.25E−05 | calcium channel, voltage-dependent, gamma subunit 1 | Plasma Membrane | ion channel
Sypl2 | −2.167562914 | 0.000345737 | 0.004491952 | synaptophysin-like 2 | Other | other
Hspb1 | −2.159559855 | 4.85E−08 | 2.26E−06 | heat shock 27 kDa protein 1 | Cytoplasm | other
Dusp27 | −2.133894504 | 4.49E−09 | 2.91E−07 | dual specificity phosphatase 27 (putative) | Other | phosphatase
Notum | −2.119495301 | 6.47E−06 | 0.000156084 | notum pectinacetylesterase homolog (Drosophila) | Extracellular Space | other
Pdk4 | −2.119339262 | 9.60E−08 | 4.04E−06 | pyruvate dehydrogenase kinase, isozyme 4 | Cytoplasm | kinase
Myo18b | −2.119301176 | 1.98E−06 | 5.66E−05 | myosin XVIIIB | Cytoplasm | other
Trim72 | −2.115143227 | 1.68E−07 | 6.67E−06 | tripartite motif containing 72, E3 ubiquitin protein ligase | Cytoplasm | enzyme
1500017E21Rik | −2.111456775 | 0.00022252 | 0.003148182 | RIKEN cDNA 1500017E21 gene | Other | other
Cnih2 | −2.099012367 | 9.24E−06 | 0.000213856 | cornichon family AMPA receptor auxiliary protein 2 | Extracellular Space | other
Mustn1 | −2.092811522 | 1.30E−06 | 3.91E−05 | musculoskeletal, embryonic nuclear protein 1 | Nucleus | other
Rbm20 | −2.092417242 | 1.29E−05 | 0.000287162 | RNA binding motif protein 20 | Nucleus | other
Casq1 | −2.090786079 | 0.000388851 | 0.004925805 | calsequestrin 1 (fast-twitch, skeletal muscle) | Cytoplasm | other
H19 | −2.08747205 | 7.92E−13 | 1.39E−10 | H19, imprinted maternally expressed transcript (non-protein coding) | Cytoplasm | other
Tlr7 | −2.071501831 | 8.33E−08 | 3.64E−06 | toll-like receptor 7 | Plasma Membrane | transmembrane receptor
Kcnc3 | −2.069589796 | 1.37E−06 | 4.09E−05 | potassium channel, voltage gated Shaw related subfamily C, member 3 | Plasma Membrane | ion channel
Twist1 | −2.066523744 | 1.16E−16 | 5.49E−14 | twist family bHLH transcription factor 1 | Nucleus | transcription regulator
Galnt3 | −2.060068511 | 4.79E−05 | 0.000861365 | polypeptide N-acetylgalactosaminyltransferase 3 | Cytoplasm | enzyme
Aldoart2 | −2.056238854 | 0.000286733 | 0.003853432 | aldolase 1 A, retrogene 2 | Other | enzyme
Bves | −2.052626225 | 1.58E−05 | 0.00033814 | blood vessel epicardial substance | Plasma Membrane | other
Myf6 | −2.041661831 | 0.000242965 | 0.0033727 | myogenic factor 6 (herculin) | Nucleus | transcription regulator
Sgms2 | −2.041025291 | 2.31E−05 | 0.000467309 | sphingomyelin synthase 2 | Plasma Membrane | enzyme
Mrc1 | −2.038914921 | 2.65E−15 | 1.04E−12 | mannose receptor, C type 1 | Plasma Membrane | transmembrane receptor
Slc8a3 | −2.036140803 | 1.48E−06 | 4.37E−05 | solute carrier family 8 (sodium/calcium exchanger), member 3 | Plasma Membrane | transporter
Mx1 | −2.032963249 | 4.78E−05 | 0.000859613 | MX dynamin-like GTPase 1 | Nucleus | enzyme
Dlx5 | −2.013607351 | 9.59E−05 | 0.00154859 | distal-less homeobox 5 | Nucleus | transcription regulator
Cd180 | −1.999988899 | 3.23E−09 | 2.28E−07 | CD180 molecule | Plasma Membrane | other
Hspb2 | −1.999144835 | 2.17E−06 | 6.09E−05 | heat shock 27 kDa protein 2 | Cytoplasm | other
Penk | −1.992073589 | 5.76E−05 | 0.000997614 | proenkephalin | Extracellular Space | other
Phospho1 | −1.974838655 | 3.91E−05 | 0.000737067 | phosphatase, orphan 1 | Extracellular Space | enzyme
Colq | −1.971439149 | 2.18E−05 | 0.000441925 | collagen-like tail subunit (single strand of homotrimer) of asymmetric acetylcholinesterase | Extracellular Space | other
Myom1 | −1.964431143 | 8.37E−05 | 0.001377131 | myomesin 1 | Cytoplasm | other
Eef1a2 | −1.963924398 | 6.39E−06 | 0.00015481 | eukaryotic translation elongation factor 1 alpha 2 | Cytoplasm | translation regulator
Ovol1 | −1.96297186 | 3.25E−05 | 0.000631034 | ovo-like zinc finger 1 | Nucleus | transcription regulator
Lrrc2 | −1.962718812 | 3.67E−06 | 9.57E−05 | leucine rich repeat containing 2 | Other | other
Ccl12 | −1.960416694 | 4.35E−09 | 2.83E−07 | chemokine (C-C motif) ligand 2 | Extracellular Space | cytokine
Otud1 | −1.9582093 | 1.43E−08 | 7.74E−07 | OTU deubiquitinase 1 | Other | peptidase
Lonrf3 | −1.955839816 | 6.45E−09 | 3.96E−07 | LON peptidase N-terminal domain and ring finger 3 | Other | other
Bai1 | −1.955724654 | 0.000100996 | 0.001616059 | #N/A | #N/A | #N/A
Hoxc9 | −1.955660858 | 1.01E−12 | 1.72E−10 | homeobox C9 | Nucleus | transcription regulator
Arpp21 | −1.945185646 | 0.000233277 | 0.003266598 | cAMP-regulated phosphoprotein, 21 kDa | Cytoplasm | other
Obscn | −1.939029162 | 0.000206389 | 0.002959758 | obscurin, cytoskeletal calmodulin and titin-interacting RhoGEF | Cytoplasm | kinase
Trem2 | −1.936446063 | 4.42E−08 | 2.09E−06 | triggering receptor expressed on myeloid cells 2 | Plasma Membrane | transmembrane receptor
Tpm1 | −1.933315987 | 6.35E−05 | 0.001083589 | tropomyosin 1, alpha | Plasma Membrane | other
Mb | −1.927240981 | 4.29E−10 | 3.79E−08 | myoglobin | Cytoplasm | transporter
Coro6 | −1.923052355 | 9.30E−05 | 0.001506695 | coronin 6 | Extracellular Space | other
Satb2 | −1.922158855 | 0.000113421 | 0.001777453 | SATB homeobox 2 | Nucleus | transcription regulator
Dlgap3 | −1.921373713 | 0.00034172 | 0.00444821 | discs, large (Drosophila) homolog-associated protein 3 | Cytoplasm | other
Ptn | −1.91531434 | 0.000166962 | 0.002471403 | pleiotrophin | Extracellular Space | growth factor
Bmp5 | −1.905399922 | 3.70E−05 | 0.000700196 | bone morphogenetic protein 5 | Extracellular Space | growth factor
Ttn | −1.901348241 | 0.000364585 | 0.004670309 | titin | Cytoplasm | kinase
Art1 | −1.901283303 | 1.60E−05 | 0.000340711 | ADP-ribosyltransferase 1 | Plasma Membrane | enzyme
Sybu | −1.900896598 | 7.80E−06 | 0.000182995 | syntabulin (syntaxin-interacting) | Other | other
Tex15 | −1.900378417 | 1.57E−05 | 0.000337967 | testis expressed 15 | Extracellular Space | other
Wnt5a | 1.906940595 | 1.25E−07 | 5.19E−06 | wingless-type MMTV integration site family, member 5A | Extracellular Space | cytokine
Ero1l | 1.910974733 | 1.72E−11 | 2.18E−09 | endoplasmic reticulum oxidoreductase alpha | Cytoplasm | enzyme
Cyp7b1 | 1.913128075 | 6.13E−05 | 0.001053765 | cytochrome P450, family 7, subfamily B, polypeptide 1 | Cytoplasm | enzyme
Timp1 | 1.914391071 | 1.11E−09 | 8.91E−08 | TIMP metallopeptidase inhibitor 1 | Extracellular Space | cytokine
Bhlhe22 | 1.920077447 | 0.00024332 | 0.0033727 | basic helix-loop-helix family, member e22 | Nucleus | transcription regulator
Clca5 | 1.922252646 | 0.000128997 | 0.001980076 | #N/A | #N/A | #N/A
Nos2 | 1.93449052 | 9.69E−06 | 0.000223586 | nitric oxide synthase 2, inducible | Cytoplasm | enzyme
Sdc1 | 1.934765043 | 1.90E−12 | 3.13E−10 | syndecan 1 | Plasma Membrane | enzyme
Ccl11 | 1.935685485 | 1.01E−05 | 0.00023188 | chemokine (C-C motif) ligand 11 | Extracellular Space | cytokine
Sfrp2 | 1.937570432 | 0.00010539 | 0.001670733 | secreted frizzled-related protein 2 | Plasma Membrane | transmembrane receptor
Adora2b | 1.937719318 | 1.43E−06 | 4.25E−05 | adenosine A2b receptor | Plasma Membrane | G-protein coupled receptor
C1rb | 1.948311191 | 6.64E−06 | 0.000159392 | complement component 1, r subcomponent | Extracellular Space | peptidase
Cadm3 | 1.954868801 | 1.65E−06 | 4.81E−05 | cell adhesion molecule 3 | Plasma Membrane | other
Gcnt4 | 1.959146422 | 5.60E−05 | 0.000974542 | glucosaminyl (N-acetyl) transferase 4, core 2 | Cytoplasm | enzyme
AA467197 | 1.95975912 | 0.000134462 | 0.002053093 | chromosome 15 open reading frame 48 | Nucleus | other
Adamts5 | 1.96550313 | 1.70E−12 | 2.84E−10 | ADAM metallopeptidase with thrombospondin type 1 motif, 5 | Extracellular Space | peptidase
Il6 | 1.96970782 | 3.32E−05 | 0.000642426 | interleukin 6 | Extracellular Space | cytokine
Acp5 | 1.97208448 | 1.73E−05 | 0.000365845 | acid phosphatase 5, tartrate resistant | Cytoplasm | phosphatase
Plac8 | 1.972654229 | 1.03E−06 | 3.18E−05 | placenta-specific 8 | Nucleus | other
Hic1 | 1.977666371 | 2.86E−10 | 2.64E−08 | hypermethylated in cancer 1 | Nucleus | transcription regulator
Il18rap | 1.988047069 | 1.49E−05 | 0.000324778 | interleukin 18 receptor accessory protein | Plasma Membrane | transmembrane receptor
Prss46 | 2.005016082 | 0.000152543 | 0.002288306 | protease, serine, 46 | Other | peptidase
Csgalnact1 | 2.006832892 | 2.07E−12 | 3.33E−10 | chondroitin sulfate N-acetylgalactosaminyltransferase 1 | Cytoplasm | enzyme
Phlda2 | 2.012095452 | 0.000118738 | 0.001848071 | pleckstrin homology-like domain, family A, member 2 | Cytoplasm | other
Barx2 | 2.013964382 | 1.83E−06 | 5.25E−05 | BARX homeobox 2 | Nucleus | transcription regulator
Kctd11 | 2.020123243 | 8.08E−12 | 1.11E−09 | potassium channel tetramerization domain containing 11 | Cytoplasm | other
Hilpda | 2.022779424 | 7.35E−08 | 3.30E−06 | hypoxia inducible lipid droplet-associated | Cytoplasm | other
Klhdc8a | 2.029163571 | 7.64E−06 | 0.000180115 | kelch domain containing 8A | Other | other
Crabp2 | 2.041000695 | 3.54E−05 | 0.000676149 | cellular retinoic acid binding protein 2 | Cytoplasm | transporter
Medag | 2.044971831 | 1.99E−12 | 3.24E−10 | mesenteric estrogen-dependent adipogenesis | Cytoplasm | other
Napsa | 2.050808167 | 6.69E−08 | 3.03E−06 | napsin A aspartic peptidase | Extracellular Space | peptidase
Col23a1 | 2.074615476 | 6.01E−09 | 3.72E−07 | collagen, type XXIII, alpha 1 | Plasma Membrane | other
Wnt2b | 2.077564459 | 1.78E−05 | 0.000373778 | wingless-type MMTV integration site family, member 2B | Extracellular Space | other
Lgi3 | 2.083185898 | 0.000201013 | 0.002897856 | leucine-rich repeat LGI family, member 3 | Extracellular Space | other
Il33 | 2.084646455 | 5.04E−11 | 5.30E−09 | interleukin 33 | Extracellular Space | cytokine
H2-Ab1 | 2.087468297 | 1.75E−09 | 1.30E−07 | major histocompatibility complex, class II, DQ beta 1 | Plasma Membrane | other
4930502E18Rik | 2.087590606 | 5.50E−05 | 0.000961458 | RIKEN cDNA 4930502E18 gene | Other | other
Osr1 | 2.114093041 | 1.84E−08 | 9.56E−07 | odd-skipped related transcription factor 1 | Nucleus | other
Serping1 | 2.116258617 | 1.60E−13 | 3.43E−11 | serpin peptidase inhibitor, clade G (C1 inhibitor), member 1 | Extracellular Space | other
P2ry10 | 2.117660014 | 6.16E−05 | 0.001055499 | purinergic receptor P2Y, G-protein coupled, 10 | Plasma Membrane | G-protein coupled receptor
Ddit4 | 2.120727355 | 1.04E−15 | 4.32E−13 | DNA-damage-inducible transcript 4 | Cytoplasm | other
Tmeff2 | 2.123849758 | 0.000286592 | 0.003853432 | transmembrane protein with EGF-like and two follistatin-like domains 2 | Cytoplasm | other
Pthlh | 2.12599575 | 3.81E−05 | 0.000719195 | parathyroid hormone-like hormone | Extracellular Space | other
Pla1a | 2.128297502 | 3.15E−12 | 4.77E−10 | phospholipase A1 member A | Extracellular Space | enzyme
Cwc22 | 2.131128484 | 0.000289077 | 0.003873516 | CWC22 spliceosome-associated protein | Nucleus | other
Adamts4 | 2.131910626 | 9.89E−12 | 1.30E−09 | ADAM metallopeptidase with thrombospondin type 1 motif, 4 | Extracellular Space | peptidase
Ocstamp | 2.133707622 | 0.000285909 | 0.003849919 | osteoclast stimulatory transmembrane protein | Other | other
Avpr1a | 2.135058799 | 3.05E−08 | 1.49E−06 | arginine vasopressin receptor 1A | Plasma Membrane | G-protein coupled receptor
Sphk1 | 2.137577627 | 5.04E−10 | 4.42E−08 | sphingosine kinase 1 | Cytoplasm | kinase
Alox12 | 2.147703459 | 7.02E−05 | 0.001179092 | arachidonate 12-lipoxygenase | Cytoplasm | enzyme
Cd74 | 2.154265386 | 8.23E−10 | 6.87E−08 | CD74 molecule, major histocompatibility complex, class II invariant chain | Plasma Membrane | transmembrane receptor
Ier3 | 2.156413161 | 6.99E−10 | 5.94E−08 | immediate early response 3 | Cytoplasm | other
Niacr1 | 2.161017459 | 4.17E−06 | 0.000107108 | #N/A | #N/A | #N/A
Galnt16 | 2.163332213 | 1.33E−11 | 1.70E−09 | polypeptide N-acetylgalactosaminyltransferase 16 | Cytoplasm | enzyme
Fam83f | 2.163464457 | 9.66E−05 | 0.001557314 | family with sequence similarity 83, member F | Other | other
Phyhipl | 2.166920709 | 0.000352603 | 0.004552696 | phytanoyl-CoA 2-hydroxylase interacting protein-like | Cytoplasm | other
H2-Aa | 2.16974298 | 3.79E−09 | 2.53E−07 | major histocompatibility complex, class II, DQ alpha 1 | Plasma Membrane | transmembrane receptor
Il1rl1 | 2.175512643 | 4.51E−06 | 0.000114652 | interleukin 1 receptor-like 1 | Plasma Membrane | transmembrane receptor
Dpt | 2.180012546 | 2.29E−13 | 4.74E−11 | dermatopontin | Extracellular Space | other
Kcnj15 | 2.180673205 | 1.37E−08 | 7.48E−07 | potassium channel, inwardly rectifying subfamily J, member 15 | Plasma Membrane | ion channel
Rnd1 | 2.181967661 | 9.57E−09 | 5.64E−07 | Rho family GTPase 1 | Cytoplasm | enzyme
Gpr114 | 2.189646297 | 1.94E−07 | 7.41E−06 | #N/A | #N/A | #N/A
Ccbp2 | 2.193904635 | 2.20E−07 | 8.27E−06 | #N/A | #N/A | #N/A
Elfn1 | 2.199467366 | 4.19E−05 | 0.000776232 | extracellular leucine-rich repeat and fibronectin type III domain containing 1 | Plasma Membrane | other
Cxadr | 2.20082858 | 0.000332993 | 0.004351168 | coxsackie virus and adenovirus receptor | Plasma Membrane | transmembrane receptor
Mcpt4 | 2.206254485 | 0.000169343 | 0.002496531 | mast cell protease 4 | Other | peptidase
Stac2 | 2.212366039 | 9.24E−09 | 5.50E−07 | SH3 and cysteine rich domain 2 | Other | other
Cxcr7 | 2.216406879 | 4.59E−14 | 1.16E−11 | #N/A | #N/A | #N/A
Foxd1 | 2.217141687 | 2.09E−05 | 0.000427216 | forkhead box D1 | Nucleus | transcription regulator
Cd209f | 2.232430603 | 0.000140148 | 0.002118637 | CD209f antigen | Other | other
Crabp1 | 2.235053806 | 0.000377869 | 0.004813427 | cellular retinoic acid binding protein 1 | Cytoplasm | transporter
Rtn4rl2 | 2.236586861 | 9.06E−08 | 3.91E−06 | reticulon 4 receptor-like 2 | Plasma Membrane | other
Slc39a14 | 2.238458836 | 2.66E−14 | 7.14E−12 | solute carrier family 39 (zinc transporter), member 14 | Plasma Membrane | transporter
Ifnlr1 | 2.242774024 | 1.94E−05 | 0.000403681 | interferon, lambda receptor 1 | Plasma Membrane | transmembrane receptor
5730416F02Rik | 2.253801229 | 5.43E−06 | 0.00013307 | capping protein (actin filament), gelsolin-like pseudogene | Other | other
Trpm6 | 2.25769376 | 2.03E−07 | 7.73E−06 | transient receptor potential cation channel, subfamily M, member 6 | Plasma Membrane | kinase
Gfra1 | 2.258361982 | 1.82E−06 | 5.23E−05 | GDNF family receptor alpha 1 | Plasma Membrane | transmembrane receptor
Egln3 | 2.26022722 | 4.82E−15 | 1.69E−12 | egl-9 family hypoxia-inducible factor 3 | Cytoplasm | enzyme
S100a9 | 2.261077077 | 0.000236631 | 0.003306796 | S100 calcium binding protein A9 | Cytoplasm | other
Fbln2 | 2.261656935 | 3.57E−10 | 3.21E−08 | fibulin 2 | Extracellular Space | other
Tnfsf11 | 2.262220766 | 0.000168565 | 0.002489291 | tumor necrosis factor (ligand) superfamily, member 11 | Extracellular Space | cytokine
S1pr3 | 2.26806489 | 8.22E−08 | 3.62E−06 | sphingosine-1-phosphate receptor 3 | Plasma Membrane | G-protein coupled receptor
Acsbg1 | 2.27206321 | 2.08E−05 | 0.000425956 | acyl-CoA synthetase bubblegum family member 1 | Cytoplasm | enzyme
Kcne3 | 2.28228483 | 7.11E−11 | 7.32E−09 | potassium channel, voltage gated subfamily E regulatory beta subunit 3 | Plasma Membrane | ion channel
Lmx1a | 2.286295067 | 8.74E−05 | 0.001429283 | LIM homeobox transcription factor 1, alpha | Nucleus | transcription regulator
Sfrp1 | 2.286645932 | 0.000382833 | 0.004863077 | secreted frizzled-related protein 1 | Plasma Membrane | transmembrane receptor
Aqp2 | 2.294873359 | 3.68E−05 | 0.000699026 | aquaporin 2 (collecting duct) | Plasma Membrane | transporter
1810033B17Rik | 2.303750907 | 1.76E−05 | 0.000371982 | #N/A | #N/A | #N/A
Tmem178 | 2.30838751 | 8.85E−06 | 0.000205126 | transmembrane protein 178A | Other | other
Figf | 2.324615912 | 3.70E−09 | 2.48E−07 | c-fos induced growth factor (vascular endothelial growth factor D) | Extracellular Space | growth factor
Slc6a2 | 2.327386448 | 5.24E−05 | 0.000926129 | solute carrier family 6 (neurotransmitter transporter), member 2 | Plasma Membrane | transporter
Gpr123 | 2.333408054 | 2.32E−07 | 8.66E−06 | #N/A | #N/A | #N/A
Ces2g | 2.337823509 | 2.07E−05 | 0.000424294 | carboxylesterase 2G | Other | enzyme
Treml4 | 2.356184647 | 7.19E−05 | 0.00120497 | triggering receptor expressed on myeloid cells-like 4 | Other | other
Doc2b | 2.356263534 | 0.000143893 | 0.002170448 | double C2-like domains, beta | Cytoplasm | transporter
Lbp | 2.35803444 | 5.08E−13 | 9.65E−11 | lipopolysaccharide binding protein | Plasma Membrane | transporter
Ifi205 | 2.360747496 | 5.29E−14 | 1.32E−11 | interferon, gamma-inducible protein 16 | Nucleus | transcription regulator
Rgs9 | 2.360817111 | 6.48E−05
0.001101485 regulator of G-Cytoplasm enzyme protein signaling 9 Arsi 2.371268625 2.16E−08 1.09E−06arylsulfatase family, Extracellular enzyme member I Space Ciita2.373772401 6.36E−07 2.08E−05 class II, major Nucleus transcriptionhistocompatibility regulator complex, transactivator Dusp4 2.3793883462.36E−06 6.57E−05 dual specificity Nucleus phosphatase phosphatase 4Rorb 2.38089784 0.000295309 0.003937748 RAR-related orphan Nucleusligand- receptor B dependent nuclear receptor Sbsn 2.383926086 1.80E−050.000377684 suprabasin Cytoplasm other Cdh1 2.392769181 3.51E−081.69E−06 cadherin 1, type 1 Plasma other Membrane Fgr 2.3959977221.73E−17 9.84E−15 FGR proto- Nucleus kinase oncogene, Src familytyrosine kinase Kcnip1 2.401474217 1.95E−05 0.000405805 Kv channelPlasma ion channel interacting protein 1 Membrane Ak4 2.4018466995.33E−08 2.46E−06 adenylate kinase 4 Cytoplasm kinase A630023A22Rik2.414937431 7.63E−05 0.001272883 RIKEN cDNA Other other A630023A22 geneHas1 2.417424502 1.62E−08 8.57E−07 hyaluronan synthase Plasma enzyme 1Membrane Sdk1 2.424720594 1.24E−08 6.91E−07 sidekick cell Plasma otheradhesion molecule 1 Membrane Gjb5 2.429127176 6.04E−06 0.000146855 gapjunction protein, Plasma transporter beta 5, 31.1kDa Membrane5730559C18Rik 2.434932765 5.17E−05 0.000919826 chromosome 1 open Otherother reading frame 106 Adm 2.437796473 3.10E−09 2.21E−07 adrenomedullinExtracellular other Space Hmga2 2.445141593 1.35E−05 0.000297813 highmobility group Nucleus enzyme AT-hook 2 Itgax 2.449058631 3.26E−161.49E−13 integrin, alpha X Plasma transmembrane (complement Membranereceptor component 3 receptor 4 subunit) Itln1 2.454578666 3.52E−071.24E−05 intelectin 1 Plasma other (galactofuranose Membrane binding)Ifitm1 2.454902535 1.49E−13 3.24E−11 interferon induced Other othertransmembrane protein 1 Sh2d5 2.457966113 5.17E−13 9.68E−11 SH2 domainPlasma other containing 5 Membrane Ndufa4l2 2.461273301 4.04E−125.88E−10 NADH Other enzyme dehydrogenase (ubiquinone) 1 alphasubcomplex, 4-like 2 
Ffar2 2.472209214 0.000226332 0.003185641 freefatty acid Plasma G-protein receptor 2 Membrane coupled receptor Scd12.488851454 3.66E−07 1.28E−05 stearoyl-CoA Cytoplasm enzyme desaturase(delta-9- desaturase) Mmp9 2.499142569 4.69E−05 0.000845283 matrixExtracellular peptidase metallopeptidase 9 Space Cxcl1 2.515376543.67E−05 0.000696707 chemokine (C-X-C Extracellular cytokine motif)ligand 2 Space Nrn1 2.520515557 2.96E−06 8.01E−05 neuritin 1 Cytoplasmother Inhbb 2.523010814 5.62E−13 1.04E−10 inhibin, beta B Extracellulargrowth factor Space Col28a1 2.531182459 9.55E−07 2.97E−05 collagen, typeExtracellular other XXVIII, alpha 1 Space Dnmt31 2.532511137 0.0002695060.003668764 DNA (cytosine-5-)- Nucleus transcription methyltransferase3- regulator like Fcrla 2.535289213 0.000140017 0.002118637 Fcreceptor-like A Plasma other Membrane Cxcl14 2.538668396 1.33E−091.04E−07 chemokine (C-X-C Extracellular cytokine motif) ligand 14 SpacePi16 2.545417685 9.68E−08 4.06E−06 peptidase inhibitor Extracellularother 16 Space C4b 2.554327389 3.52E−09 2.42E−07 complementExtracellular other component 4B Space (Chido blood group) Gzmc2.586719596 2.23E−06 6.24E−05 granzyme C Cytoplasm peptidase Car92.588221576 5.94E−09 3.70E−07 carbonic anhydrase Nucleus enzyme IXH2-Eb1 2.589361992 4.28E−11 4.72E−09 major Plasma transmembranehistocompatibility Membrane receptor complex, class II, DR beta 5 Cpa32.592224708 0.000349947 0.004527105 carboxypeptidase A3 Extracellularpeptidase (mast cell) Space Rhov 2.597158136 3.97E−05 0.000744269 rashomolog family Plasma enzyme member V Membrane Smoc1 2.6092766591.13E−17 7.01E−15 SPARC related Extracellular other modular calciumSpace binding 1 Cd244 2.610625963 2.48E−09 1.82E−07 CD244 molecule,Plasma transmembrane natural killer cell Membrane receptor receptor 2B4Serpina3h 2.626086546 9.48E−16 4.05E−13 serine (or cysteine)Extracellular other peptidase inhibitor, Space clade A, member 3H Dpp62.627269895 3.28E−06 8.69E−05 dipeptidyl-peptidase Plasma other 
6Membrane Tmem95 2.630604846 3.26E−06 8.64E−05 transmembrane Other otherprotein 95 Rgs16 2.641503944 7.07E−14 1.61E−11 regulator of G- Cytoplasmother protein signaling 16 Mmp12 2.661387138 8.47E−12 1.14E−09 matrixExtracellular peptidase metallopeptidase 12 Space Ttyh1 2.6663102691.17E−08 6.67E−07 tweety family Plasma ion channel member 1 MembraneTmem125 2.684592738 0.00020899  0.002984538 transmembrane Other otherprotein 125 Pcsk5 2.688181091 2.88E−12 4.42E−10 proprotein Extracellularpeptidase convertase Space subtilisin/kexin type 5 Slc2a1 2.7010769562.95E−13 5.94E−11 solute carrier family Plasma transporter 2(facilitated glucose Membrane transporter), member 1 Frmd5 2.7042557717.70E−05 0.001282835 FERM domain Other other containing 5 Col5a32.706994884 9.13E−22 9.61E−19 collagen, type V, Extracellular otheralpha 3 Space Dmkn 2.711140688 0.00034439  0.004478701 dermokineExtracellular other Space Lrrc15 2.71712803 3.40E−12 5.06E−10 leucinerich repeat Plasma other containing 15 Membrane C3 2.726900819 2.87E−102.64E−08 complement Extracellular peptidase component 3 Space Nt5e2.732947213 7.62E−12 1.05E−09 5′-nucleotidase, ecto Plasma phosphatase(CD73) Membrane Serpind1 2.762623986 9.29E−07 2.91E−05 serpin peptidaseExtracellular other inhibitor, clade D Space (heparin cofactor), member1 Unc13a 2.772685031 1.69E−09 1.27E−07 unc-13 homolog A Plasma other (C.elegans) Membrane Tpsb2 2.780651006 8.40E−08 3.65E−06 tryptasealpha/beta 1 Extracellular peptidase Space Inhba 2.795338332 3.18E−081.54E−06 inhibin, beta A Extracellular growth factor Space C4a2.798731104 2.61E−10 2.44E−08 complement Extracellular other component4B Space (Chido blood group) Slc2a3 2.810697868 2.17E−13 4.56E−11 solutecarrier family Plasma transporter 2 (facilitated glucose Membranetransporter), member 3 Wt1 2.82046663 0.000169678 0.002498782 Wilmstumor 1 Nucleus transcription regulator 1300002K09Rik 2.8342745087.10E−07 2.29E−05 #N/A #N/A #N/A Vat1l 2.842318796 6.74E−05 0.001138678vesicle amine Other 
enzyme transport 1-like Il1b 2.842564699 3.03E−243.76E−21 interleukin 1, beta Extracellular cytokine Space Gjb32.850664786 2.13E−06 6.01E−05 gap junction protein, Plasma transporterbeta 3, 31 kDa Membrane Sfrp4 2.873431041 5.56E−14 1.35E−11 secretedfrizzled- Plasma transmembrane related protein 4 Membrane receptor Osbp22.894267579 8.23E−11 8.28E−09 oxysterol binding Cytoplasm other protein2 Serpina3i 2.896733041 3.59E−11 4.13E−09 serine (or cysteine) Otherother peptidase inhibitor, clade A, member 3G Ccbe1 2.899591596 1.80E−145.13E−12 collagen and Extracellular other calcium binding Space EGFdomains 1 Dnase1l3 2.914095612 4.71E−11 5.03E−09 deoxyribonuclease I-Nucleus enzyme like 3 Prg4 2.920652095 1.48E−13 3.24E−11 proteoglycan 4Extracellular other (megakaryocyte Space stimulating factor, articularsuperficial zone protein) Serpine1 2.943112329 2.65E−13 5.41E−11 serpinpeptidase Extracellular other inhibitor, clade E Space (nexin,plasminogen activator inhibitor type 1), member 1 Nfasc 2.9553860814.40E−11 4.81E−09 neurofascin Plasma other Membrane Tnfsf8 2.9580980257.42E−08 3.31E−06 tumor necrosis Plasma cytokine factor (ligand)Membrane superfamily, member 8 Adra2a 2.963089237 3.96E−36 1.36E−32adrenoceptor alpha Plasma G-protein 2A Membrane coupled receptor Syt52.967862699 1.03E−09 8.32E−08 synaptotagmin V Cytoplasm transporter Erv32.976253995 1.57E−05 0.000337967 endogenous Other other retroviralsequence 3 Lgi2 2.980280668 7.82E−07 2.49E−05 leucine-rich repeatExtracellular other LGI family, member Space 2 Adcy5 2.9897455481.35E−15 5.42E−13 adenylate cyclase 5 Plasma enzyme Membrane Lcn23.00569291 4.88E−09 3.09E−07 lipocalin 2 Extracellular transporter SpaceSyt17 3.012074261 1.31E−06 3.93E−05 synaptotagmin XVII Plasma otherMembrane Efemp1 3.013217113 2.56E−07 9.39E−06 EGF containingExtracellular enzyme fibulin-like Space extracellular matrix protein 1Fam5c 3.02276903 1.82E−07 7.05E−06 #N/A #N/A #N/A Sorcs1 3.0453556881.60E−05 0.000340968 sortilin-related Plasma 
transporter VPS 10 domainMembrane containing receptor 1 Adamts15 3.054035878 7.61E−15 2.54E−12ADAM Extracellular peptidase metallopeptidase Space with thrombospondintype 1 motif, 15 Clec2e 3.07635781 6.84E−06 0.000163535 C-type lectindomain Plasma transmembrane family 2, member h Membrane receptor Chl13.084606232 5.81E−07 1.92E−05 cell adhesion Plasma other moleculeL1-like Membrane Mmrn1 3.110168932 0.000104343 0.001656047 multimerin 1Extracellular other Space Gpr35 3.118726492 3.52E−19 2.84E−16 Gprotein-coupled Plasma G-protein receptor 35 Membrane coupled receptorRarres2 3.147276259 1.10E−09 8.87E−08 retinoic acid Plasma transmembranereceptor responder Membrane receptor (tazarotene induced) 2 Pgf3.150550679 8.07E−17 4.09E−14 placental growth Extracellular growthfactor factor Space Serpina3f 3.155375641 5.61E−14 1.35E−11 serine (orcysteine) Other other peptidase inhibitor, clade A, member 3G Il1r23.155818097 9.76E−15 3.11E−12 interleukin 1 Plasma transmembranereceptor, type II Membrane receptor Il13ra2 3.208712372 4.27E−050.000786196 interleukin 13 Plasma transmembrane receptor, alpha 2Membrane receptor Nxph4 3.212429551 1.20E−08 6.76E−07 neurexophilin 4Extracellular other Space Slit1 3.221521866 8.08E−08 3.56E−06 slitguidance ligand Extracellular other 1 Space Col10a1 3.255299689 8.78E−083.80E−06 collagen, type X, Extracellular other alpha 1 Space Grem13.306998841 4.81E−09 3.07E−07 gremlin 1, DAN Extracellular other familyBMP Space antagonist Rpl21 3.319158494 0.000321595 0.004226455 ribosomalprotein Cytoplasm other L21 Ly6k 3.330210263 1.32E−05 0.000292565lymphocyte antigen Nucleus other 6 complex, locus K Pcsk9 3.3436628721.11E−05 0.000249753 proprotein Extracellular peptidase convertase Spacesubtilisin/kexin type 9 Dbx2 3.374124712 8.16E−10 6.85E−08 developingbrain Nucleus transcription homeobox 2 regulator B3galt5 3.428879143.17E−06 8.47E−05 UDP- Cytoplasm enzyme Gal:betaGlcNAc beta 1,3-galactosyltransferase , polypeptide 5 Il11 3.446479515 1.36E−08 
7.48E−07interleukin 11 Extracellular cytokine Space Htr1b 3.47009247 2.52E−146.90E−12 5- Plasma G-protein hydroxytryptamine Membrane coupled(serotonin) receptor receptor 1B, G protein- coupled Cxcl13 3.5547993873.92E−05 0.000737235 chemokine (C-X-C Extracellular cytokine motif)ligand 13 Space 9330182L06Rik 3.599154487 3.76E−06 9.79E−05KIAA1324-like Other other Cd207 3.698500979 6.80E−11 7.05E−09 CD207molecule, Plasma other langerin Membrane Serpina3n 3.699329372 1.12E−132.51E−11 serpin peptidase Extracellular other inhibitor, clade A Space(alpha-1 antiproteinase, antitrypsin), member 3 Tmem132e 3.7067364269.24E−08 3.94E−06 transmembrane Other other protein 132E Serpina3m3.722226604 5.48E−18 3.57E−15 serpin peptidase Extracellular otherinhibitor, clade A Space (alpha-1 antiproteinase, antitrypsin), member 3Kcnmb1 3.771552594 2.30E−08 1.15E−06 potassium channel Plasma ionchannel subfamily M Membrane regulatory beta subunit 1 Gpr1413.898353543 1.01E−10 9.98E−09 G protein-coupled Plasma G-proteinreceptor 141 Membrane coupled receptor Arg1 3.924911517 2.76E−081.37E−06 arginase 1 Cytoplasm enzyme Tpsab1 3.958781996 9.04E−152.94E−12 tryptase alpha/beta 1 Nucleus peptidase Ereg 3.9936044169.00E−07 2.82E−05 epiregulin Extracellular growth factor Space Mmp134.025132705 3.95E−15 1.42E−12 matrix Extracellular peptidasemetallopeptidase 13 Space Tnfrsf9 4.100043244 1.10E−24 1.67E−21 tumornecrosis Plasma transmembrane factor receptor Membrane receptorsuperfamily, member 9 Slc7a11 4.123064166 1.29E−17 7.68E−15 solutecarrier family Plasma transporter 7 (anionic amino Membrane acidtransporter light chain, xc- system), member 11 Akr1c18 4.1339602731.07E−11 1.40E−09 aldo-keto reductase Cytoplasm enzyme family 1, memberC3 Mgarp 4.215853911 3.91E−11 4.38E−09 mitochondria- Cytoplasm otherlocalized glutamic acid-rich protein Serpina3k 4.258478095 6.90E−131.26E−10 serpin peptidase Extracellular other inhibitor, clade A Space(alpha-1 antiproteinase, antitrypsin), member 3 Ccl20 
4.3256578411.88E−10 1.80E−08 chemokine (C-C Extracellular cytokine motif) ligand 20Space Cfi 4.589933583 1.17E−09 9.20E−08 complement factor IExtracellular peptidase Space Reg3g 4.66412117 1.46E−12 2.47E−10regenerating islet- Extracellular other derived 3 gamma Space Krt194.779241895 1.21E−05 0.000271903 keratin 19, type I Cytoplasm otherPtprn 4.824685996 6.13E−22 6.98E−19 protein tyrosine Plasma phosphatasephosphatase, Membrane receptor type, N A2m 4.936644365 1.29E−07 5.33E−06alpha-2- Extracellular transporter macroglobulin Space Saa3 4.9381905546.34E−08 2.88E−06 serum amyloid A 3 Extracellular other Space Gzme5.281856145 6.57E−14 1.52E−11 granzyme H Cytoplasm peptidase (cathepsinG-like 2, protein h-CCPX) Mmp3 5.447660714 4.84E−28 8.28E−25 matrixExtracellular peptidase metallopeptidase 3 Space Prokr2 5.9033135542.25E−14 6.29E−12 prokineticin receptor Plasma G-protein 2 Membranecoupled receptor Fgf23 6.223273913 4.42E−14 1.14E−11 fibroblast growthExtracellular growth factor factor 23 Space Mcpt2 6.857304981 4.93E−301.12E−26 mast cell protease 2 Extracellular peptidase Space Gzmd7.248393542 7.17E−13 1.29E−10 granzyme H Cytoplasm peptidase (cathepsinG-like 2, protein h-CCPX) Cldn10 7.636808366 5.34E−09 3.35E−07 claudin10 Plasma other Membrane Mmp10 7.64543229 2.07E−24 2.84E−21 matrixExtracellular peptidase metallopeptidase 10 Space Gm9992 7.6486649194.61E−08 2.16E−06 unc-93 homolog A Plasma other (C. elegans) MembraneMcpt8 8.11716942 5.31E−12 7.57E−10 mast cell protease 8 Cytoplasm otherReg1 10.74685846 8.53E−19 6.49E−16 regenerating islet- Extracellulargrowth factor derived 1 alpha Space Mcpt1 11.25382227 2.86E−47 3.92E−43mast cell protease 1 Other peptidase

Various aspects of the disclosure have been described. These and other aspects are within the scope of the following claims.

What is claimed is:
 1. A method comprising:
    xenografting tissue from a donor organism of a second species on to a host organism of a first species;
    obtaining a sample derived from the host organism, wherein the sample comprises a plurality of molecules of messenger ribonucleic acid (mRNA);
    determining, for substantially each molecule of the plurality of molecules of mRNA, a corresponding RNA sequence;
    generating a combined dataset of RNA sequence reads by aligning each RNA sequence to a combined reference genome, wherein aligning each RNA sequence to the combined reference genome includes comparing a genomic location of each corresponding RNA sequence with a genomic location of a gene sequence of the combined reference genome, wherein the combined reference genome includes one or more gene sequences from at least a portion of a first genome derived from the first species and at least a portion of a second genome derived from the second species, and wherein the combined dataset of RNA sequence reads includes RNA sequence reads from both the first species and the second species;
    filtering non-unique RNA sequence reads from the combined dataset by identifying species-specific RNA sequences exclusive to either the first species or the second species, wherein identifying species-specific RNA sequences includes determining, for each corresponding RNA sequence, whether the RNA sequence is substantially aligned with exactly one corresponding gene sequence of the combined reference genome;
    at least one of:
        differentiating an origin species of each species-specific RNA sequence in the filtered combined dataset by determining whether the corresponding RNA sequence is aligned to a gene sequence of the combined reference genome associated with the first genome of the first species or the second genome of the second species, or
        quantifying an abundance level of each species-specific RNA sequence in the sample by determining an approximate number of times that each RNA sequence substantially aligned with exactly one corresponding gene occurs in the sample; and
    determining, based on one or more of the differentiation of the origin species or the quantification of the abundance level, that the tissue derived from the donor organism contains a biomarker indicative of at least one of: a disease status, a response of the host organism to the tissue derived from the donor organism, a response of tissue derived from the donor organism to transplantation within the host organism, or a response of the host organism to therapy administered to the host organism.
 2. The method of claim 1, wherein the one or more gene sequences comprise at least one of one or more coding sequences or one or more regulatory sequences.
 3. The method of claim 1, further comprising generating the combined reference genome.
 4. The method of claim 3, wherein generating the combined reference genome comprises:
    identifying, for each of the one or more gene sequences, a corresponding location within the combined reference genome; and
    annotating, for each of the one or more gene sequences, the corresponding location to indicate the origin species of the corresponding gene sequence.
 5. The method of claim 1, further comprising, for each species-specific RNA sequence, determining that the exactly one corresponding gene sequence is associated with a predetermined cluster of gene sequences.
 6. The method of claim 5, wherein the predetermined cluster of gene sequences comprises a group of genes sharing one or more functional characteristics.
 7. The method of claim 6, wherein the one or more functional characteristics comprise one or more biological processes or canonical pathways.
 8. The method of claim 7, wherein the one or more biological processes or functional characteristics comprise one or more of transcriptional regulation, intracellular signaling, intercellular signaling, cell apoptosis, biomolecule metabolism, biomolecule synthesis, RNA processing, or macromolecule assembly.
 9. The method of claim 1, wherein the donor organism contains a biomarker indicative of the disease status, and wherein the biomarker comprises a nucleic acid sequence associated with a disease.
 10. The method of claim 1, wherein the donor organism contains a biomarker indicative of the disease status, and wherein the disease status comprises at least one of: the presence or absence of a disease state, one or more characteristics of an existing disease state, a likelihood of a future progression of an existing disease state, or one or more characteristics of a predicted future progression of an existing disease state.
 11. The method of claim 1, further comprising determining, based on determining that the tissue derived from the donor organism contains the biomarker indicative of the disease status, a therapy to be administered to at least one of the host organism or the donor organism.
 12. The method of claim 11, further comprising administering the determined therapy to the at least one of the host organism or the donor organism.
 13. The method of claim 1, wherein the donor organism contains a biomarker indicative of a response to the tissue derived from the donor organism, and wherein the response of the host organism to the tissue derived from the donor organism corresponds to one of acceptance or rejection of the tissue derived from the donor organism by the host organism.
 14. The method of claim 1, wherein obtaining the sample of bodily fluid derived from the host organism comprises:
    obtaining a sample of blood;
    isolating, from the sample of blood, a volume of blood serum; and
    isolating, from the volume of blood serum, a plurality of exosomes.
 15. The method of claim 1, further comprising:
    isolating, from the sample of bodily fluid, the plurality of molecules of mRNA;
    purifying the molecules of mRNA;
    performing a reverse-transcriptase polymerase chain reaction using the molecules of RNA to produce a plurality of molecules of complementary deoxyribonucleic acid (cDNA), wherein each molecule of the plurality of molecules of cDNA corresponds to one of the plurality of molecules of mRNA;
    performing a polymerase chain reaction to amplify the molecules of cDNA;
    transcribing substantially each of the molecules of cDNA into RNA; and
    determining the nucleic acid sequence of substantially each of the molecules of mRNA.
 16. The method of claim 1, wherein the first species comprises one of a rodent species or a non-human primate species, and wherein the second species comprises one of a canine, feline, porcine, or human species.
 17. A method comprising:
    xenografting tissue from a donor organism of a second species on to a host organism of a first species;
    obtaining a sample derived from the host organism, wherein the sample comprises a plurality of molecules of messenger ribonucleic acid (mRNA);
    generating a combined reference genome, wherein the combined reference genome comprises one or more gene sequences from: at least a portion of a first genome derived from the first species and at least a portion of the second genome derived from the second species;
    determining, for substantially each molecule of the plurality of molecules of mRNA, a corresponding RNA sequence;
    generating a combined dataset of RNA sequence reads by aligning each RNA sequence to the combined reference genome, wherein aligning each RNA sequence to the combined reference genome includes comparing a genomic location of each corresponding RNA sequence with a genomic location of a gene sequence of the combined reference genome, and wherein the combined dataset of RNA sequence reads includes RNA sequence reads from both the first species and the second species;
    filtering non-unique RNA sequence reads from the combined dataset by identifying species-specific RNA sequences exclusive to either the first species or the second species, wherein identifying species-specific RNA sequences includes determining, for each corresponding RNA sequence, whether the RNA sequence is substantially aligned with exactly one corresponding gene sequence of the combined reference genome.
 18. The method of claim 17, wherein generating the combined reference genome further comprises:
    identifying, for each of the one or more gene sequences of the combined reference genome, a corresponding location within the combined reference genome; and
    annotating, for each of the one or more gene sequences of the combined reference genome, the corresponding location to indicate the origin species of the corresponding gene sequence.
 19. The method of claim 17, wherein generating the combined reference genome further comprises:
    receiving data indicating gene sequences of at least a portion of the first genome derived from the first species;
    receiving data indicating gene sequences of at least a portion of the second genome derived from the second species; and
    outputting one or more computer files representing the one or more gene sequences of the combined reference genome.
 20. The method of claim 17, further comprising at least one of:
    differentiating an origin species of each species-specific RNA sequence in the filtered combined dataset by determining whether the corresponding RNA sequence is aligned to a gene sequence of the combined reference genome associated with the first genome of the first species or the second genome of the second species, or
    quantifying an abundance level of each species-specific RNA sequence in the sample by determining an approximate number of times that each RNA sequence substantially aligned with exactly one corresponding gene occurs in the sample.
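
As a non-limiting illustration only, the filtering, differentiating, and quantifying steps recited in the claims above may be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the alignment records, the species annotation `GENE_SPECIES`, the read identifiers, and the function names `filter_unique`, `origin_species`, and `abundance` are all hypothetical, and the toy data is made up. It assumes an upstream aligner has already reported, for each read, every gene sequence of the combined reference genome to which the read aligned.

```python
from collections import Counter, defaultdict

# Hypothetical annotation of a combined reference genome:
# gene sequence -> origin species (assumed labels, for illustration only).
GENE_SPECIES = {
    "Mmp3": "mouse",
    "Actb": "mouse",
    "TP53": "human",
    "ACTB": "human",
}

# Made-up alignment results as (read_id, gene_id) pairs.
# A read that aligned to several gene sequences appears several times.
ALIGNMENTS = [
    ("r1", "Mmp3"),
    ("r2", "TP53"),
    ("r3", "Actb"),  # r3 aligns to a mouse gene ...
    ("r3", "ACTB"),  # ... and to a human gene, so it is non-unique
    ("r4", "TP53"),
]


def filter_unique(alignments):
    """Keep only reads aligned with exactly one gene sequence."""
    genes_by_read = defaultdict(set)
    for read_id, gene_id in alignments:
        genes_by_read[read_id].add(gene_id)
    return {
        read_id: next(iter(genes))
        for read_id, genes in genes_by_read.items()
        if len(genes) == 1
    }


def origin_species(unique_reads, gene_species):
    """Differentiate the origin species of each uniquely aligned read."""
    return {read_id: gene_species[gene_id]
            for read_id, gene_id in unique_reads.items()}


def abundance(unique_reads):
    """Count how many uniquely aligned reads map to each gene sequence."""
    return Counter(unique_reads.values())


def run_example():
    unique_reads = filter_unique(ALIGNMENTS)
    return origin_species(unique_reads, GENE_SPECIES), abundance(unique_reads)


if __name__ == "__main__":
    species, counts = run_example()
    print(species)
    print(dict(counts))
```

In this toy input, read `r3` aligns to both a mouse and a human gene sequence and is therefore filtered out, while `r1` is assigned to the mouse genome and `r2` and `r4` are assigned to the human genome and counted toward the same gene. A practical pipeline would typically derive uniqueness from the aligner's own output rather than from a hand-built list.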